Status of Claims

Claims 1-22 are pending in the application. 

Claims 21 and 22 stand allowed. 

Claims 5, 15, and 16 are objected to as being dependent upon a rejected base 
claim, but would be allowable if rewritten in independent form including all of the 
limitations of the base claim and any intervening claims. 

Claims 1-4, 6-14, and 17-20 are rejected based on prior art. 

A copy of the claims under appeal, as now presented, is appended to this brief in
Appendix A.

Status of Amendments 

All amendments to the claims have been entered. 

Summary of the Invention 

It is well known in the art that multiple-input, multiple-output (MIMO) systems 
can achieve dramatically improved capacity as compared to single antenna, i.e., single 
antenna to single antenna or multiple antenna to single antenna, systems. However, to 
achieve this improvement, it is preferable that there be a rich scattering environment, so
that the various signals reaching the multiple receive antennas are largely uncorrelated. If
the signals have some degree of correlation, and such correlation is ignored, performance 
degrades and capacity is reduced. 

The invention relates to a way of developing signals in a MIMO system such that, 
even in the face of some correlation, the maximum performance and capacity that can be
achieved with a channel of that level of correlation are obtained. In accordance with the
principles of the invention, the signals transmitted from the various antennas are 
processed so as to improve the ability of the receiver to extract them from the received 
signal. More specifically, the number of bit streams that is transmitted simultaneously is
adjusted, e.g., reduced, depending on the level of correlation, while multiple versions of
each bit stream, variously weighted, are transmitted simultaneously. The variously
weighted versions are combined to produce one combined weighted signal, a so-called
"transmit vector", for each antenna. The receiver processes the received signals in the 
same manner as it would have had all the signals reaching the receive antennas been 
uncorrelated. 
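
The weighting and combining step described above can be illustrated with a short Python sketch using NumPy. It is only a sketch under stated assumptions: the function name `build_transmit_signals`, the array shapes, and the random test data are hypothetical and do not appear in the application. Each of M substreams is multiplied by N weights, one per transmit antenna, and the weighted versions destined for the same antenna are summed.

```python
import numpy as np

def build_transmit_signals(substreams, weights):
    """Weight each substream once per antenna and sum per antenna.

    substreams: (M, T) array, M substreams of T symbols each.
    weights:    (M, N) array, weights[m, n] scales substream m for antenna n.
    Returns an (N, T) array: one combined signal per transmit antenna.
    (Names and shapes are illustrative assumptions, not from the application.)
    """
    substreams = np.asarray(substreams)
    weights = np.asarray(weights)
    # weighted[m, n, t] = weights[m, n] * substreams[m, t]
    weighted = weights[:, :, None] * substreams[:, None, :]
    # Combine, for each antenna n, the weighted versions of all M substreams.
    return weighted.sum(axis=0)

rng = np.random.default_rng(0)
M, N, T = 2, 3, 4  # illustrative sizes only
substreams = rng.choice([-1.0, 1.0], size=(M, T))
weights = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
tx = build_transmit_signals(substreams, weights)
print(tx.shape)  # (3, 4)
```

The same result can be obtained as the matrix product `weights.T @ substreams`; the explicit form is used here only to mirror the "weight, then combine" wording.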

In one embodiment of the invention, the weight vectors are determined by the 
forward channel transmitter using the channel properties of the forward link which are
made known to the transmitter of the forward link by being transmitted from the receiver
of the forward link by the transmitter of the reverse link. In another embodiment of the 
invention the weight vectors are determined by the forward channel receiver using the 
channel properties of the forward link and the determined weight vectors are made known 
to the transmitter of the forward link by being transmitted from the receiver of the 
forward link by the transmitter of the reverse link. 

The channel properties used to determine the weight vectors include the channel
response from the transmitter to the receiver and the covariance matrix of noise and 
interference, e.g., the interference covariance matrix, measured at the receiver. 

Issues 

I. Are claims 17-19 properly rejected under 35 U.S.C. 102(b) as being anticipated
by EP 0 807 989 A1?

II. Are claims 1-4 and 6-14 properly rejected under 35 U.S.C. 103(a) as being 
unpatentable over United States Patent No. 6,351,499 issued to Paulraj et al. on February 
26, 2002 in view of United States Patent No. 5,982,327 issued to Vook et al. on 
November 9, 1999?



Grouping of Claims 

Claims 1-5 are a group of method claims. 

Claims 6-9 are a group of apparatus claims. 

Claims 10-16 are a group of apparatus claims. 

Claim 17 is an apparatus claim that is a group unto itself. 

Claim 18 is an apparatus claim that is a group unto itself. 

Claims 19 and 20 are a group of apparatus claims.

Claims 21 and 22 are each a separate group that stands allowed. 

Claims 5, 15, and 16 are objected to as being dependent upon a rejected base 
claim, but would be allowable if rewritten in independent form including all of the 
limitations of the base claim and any intervening claims. 

Each group of claims, as well as each objected-to claim, stands separately.
Arguments

Issues I and II - Prior-Art-Based Rejections 

Claims 17-19 are rejected under 35 U.S.C. 102(b) as being anticipated by EP 0
807 989 A1.

Claims 1-4 and 6-14 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over United States Patent No. 6,351,499 issued to Paulraj et al. on February 26, 2002 in 
view of United States Patent No. 5,982,327 issued to Vook et al. on November 9, 1999. 

These grounds of rejection are respectfully traversed for the following reasons.

Each claim that is rejected based on prior art recites the use of an interference
covariance matrix in the computation of weights. The cited references do not
teach employing an interference covariance matrix in computing weights.

Applicants note that some of the references compute the covariance matrix of 
signals. However, notwithstanding the Office Action's suggestion to the contrary, the 
covariance matrix of signals is quite different from the interference covariance matrix. 
Use of the interference covariance matrix, as recited in applicants' claims,
advantageously allows the invention to be employed in an environment in which the
noise plus interference is not white, the interference contributing the non-white
component. This is not possible with the cited references.

Nevertheless, the Final and Advisory Office Actions maintain that, in teaching the
use of the covariance matrix of signals in computing weights, the prior art already teaches
applicants' recited limitation of employing an interference covariance matrix in
computing weights. This is based, according to the Office Action, on giving the limiting
term "interference covariance matrix" what the Office Action calls its broadest reasonable
interpretation, since the term is not defined in the specification. Apparently then,
according to the Office Action's broad interpretation, the term "interference covariance
matrix" is so broad that it includes the "covariance matrix of signals".

The Office Action's analysis and conclusion are, quite simply, incorrect. 

The Office Action states that it recognizes that applicants may be their own 
lexicographer. However, applicants are not defining any terms, and instead are using the
terms in a well-accepted manner. Rather, it is the Office Action that is acting as a
lexicographer, in a manner that is not permitted. Indeed, the Office Action is attempting
to state, analogously, that when applicants recite "red", they really mean "green".

As is well known, in communication systems, "signals" are the desired 
information to be communicated, while "interference" and "noise" are the undesirable
effects imparted by a channel onto the signal. Interference is caused by the signals of 
others, while noise is caused by nature, e.g., the motion of electrons as a result of 
temperature. As a result, noise is typically "white" in space, time and frequency, while 
interference generally is not "white". 

Thus, applicants' interference covariance matrix relates to the undesired. By 
contrast, the Office Action's covariance matrix of signals relates to the desired. It is 
unreasonable to say that they are the same thing, or even that one suggests the other. 

Applicants note that the term "interference covariance matrix", a.k.a.
"covariance matrix of interference", and the term "covariance matrix of signals" are terms 
of art. More specifically, the interference covariance matrix results from applying basic 
mathematics to interference encountered in a MIMO communications system. As such, 
"covariance matrix of signals", as used in the references, cannot be defined to include the 
limitation of "interference covariance matrix" as recited in applicants' claims. This can 
be clearly seen from the following basic explanation. 

Mathematically, a random vector typically has a covariance matrix associated
with it. The covariance matrix for a vector is a generalization of the concept of a variance
for a random scalar. More specifically, the (i,j) entry of the covariance matrix indicates the
covariance between the i-th and the j-th entries of the vector to which the covariance
matrix pertains.
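
As a small numerical illustration of this definition, the following Python sketch (using NumPy) estimates the covariance matrix of a three-entry random vector from samples. The function name `sample_covariance` and the test data are assumptions chosen for illustration. The diagonal entries reproduce the ordinary variances of the individual entries, and the off-diagonal (i,j) entry captures how entries i and j vary together.

```python
import numpy as np

def sample_covariance(samples):
    """Estimate the covariance matrix of a random vector.

    samples: (n, T) array holding T realizations of an n-entry vector,
             one realization per column (an illustrative convention).
    Returns the (n, n) matrix whose (i, j) entry is the sample covariance
    between entries i and j.
    """
    x = np.asarray(samples)
    centered = x - x.mean(axis=1, keepdims=True)
    return centered @ centered.conj().T / x.shape[1]

rng = np.random.default_rng(1)
x = rng.standard_normal((3, 10000))
x[1] += 0.8 * x[0]  # make entries 0 and 1 correlated; entry 2 stays independent

C = sample_covariance(x)
print(np.round(C, 2))
```

With this test data, the (0,1) entry comes out near 0.8 while the (0,2) entry stays near zero, which is the scalar notion of variance extended to pairs of entries.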

In MIMO communication systems, the signal, i.e., the desired information, and
the interference, i.e., the undesired contribution, other than the noise, that is introduced
by the channel and causes the signal to be degraded, are, naturally, separate items and
each is represented by its own distinct vector. That the signal and the interference are
each represented by a vector is due to the multidimensional nature of the MIMO system.
Naturally then, the signal vector and the interference vector each has associated with it
its own respective covariance matrix.

In other words, given that there is a signal vector, there is a covariance matrix for 
that signal vector, known as the signal covariance matrix. Also, given that there is an 
interference vector, there is a covariance matrix for the interference vector, known as the 
interference covariance matrix. 

Thus, notwithstanding the Office Action's suggestion to the contrary, the signal 
covariance matrix and the interference covariance matrix are different quantities and they 
are respectively computed based on different vectors. 
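
The distinction can be made concrete with a short Python sketch using NumPy. Everything in it is an illustrative assumption rather than material from the application or the cited references: a hypothetical 3-antenna receiver, a desired user with channel matrix `H`, a single interferer with channel vector `g`, and unit-power random symbols. The signal covariance matrix is computed from the signal vector and the interference covariance matrix from the interference vector, and the two come out as different matrices.

```python
import numpy as np

rng = np.random.default_rng(2)
L_rx, N_tx, T = 3, 2, 20000  # illustrative sizes only

def crandn(*shape):
    """Complex Gaussian samples with unit variance per entry."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = crandn(L_rx, N_tx)  # hypothetical channel of the desired user
g = crandn(L_rx, 1)     # hypothetical channel of one interferer

s = crandn(N_tx, T)     # desired symbols (unit power)
u = crandn(1, T)        # interferer symbols (unit power)

signal = H @ s          # signal vector seen at the receive antennas
interference = g @ u    # interference vector seen at the same antennas

# Each covariance matrix is computed from its own vector.
K_signal = signal @ signal.conj().T / T
K_interf = interference @ interference.conj().T / T

# A single interferer yields one dominant eigenvalue and nonzero
# off-diagonal entries, i.e. the interference is not spatially white.
eigs = np.linalg.eigvalsh(K_interf)
print(np.round(np.abs(K_signal), 2))
print(np.round(np.abs(K_interf), 2))
```

In this sketch `K_signal` approximates `H @ H.conj().T` and `K_interf` approximates `g @ g.conj().T`, so neither can be substituted for the other.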

Clearly then, the two terms cannot be simply substituted for one another. One 
relates to the signal, which is good, the other to the interference, which is bad. 
Furthermore, use of one does not suggest use of the other. In fact, the cited references do 
not treat the interference as a vector, but instead simplify its representation to a scalar.
Thus, the references, at best, have a covariance matrix of signals, but they do not have an
interference covariance matrix. 

Attention is directed to the paper, attached hereto, entitled MIMO Channel 
Capacity in Co-Channel Interference by Yi Song and Steven D. Blostein, in which 
equation 2 shows the covariance matrix of the interference-plus-noise and equation 3 
shows the covariance matrix of the received signal, thus clearly distinguishing between 
the two. Note that the recited term interference covariance matrix is the covariance 
matrix of the interference-plus-noise when the noise is negligible with respect to the 
interference. 
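
The relationship stated in the last sentence can be written out explicitly. The following is a sketch only, assuming a zero-mean interference vector i and a zero-mean noise vector w that are uncorrelated with each other, with the noise white and of variance sigma squared. The symbols i, w, K_I, K_{I+N}, and sigma are labels introduced here for illustration and are not taken from the specification.

```latex
K_{I+N} = E\{(i + w)(i + w)^{\dagger}\}
        = E\{i\, i^{\dagger}\} + E\{w\, w^{\dagger}\}
        = K_I + \sigma^2 I ,
\qquad
K_{I+N} \to K_I \quad \text{as} \quad \sigma^2 \to 0 .
```

Under these assumptions the cross terms vanish, so the interference-plus-noise covariance matrix differs from the interference covariance matrix only by the white-noise term.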

Other articles, attached hereto as well, which use the term interference covariance 
matrix as a term of art are, for example, Channel Estimation and Data Detection for 
MIMO Systems under Spatially and Temporally Colored Interference by Yi Song and 
Steven D. Blostein; Unique Features of Subspace Processing for Adaptive Radar in 
Inhomogeneous Environments by William L. Melvin; Interference Mitigation in STAP 
Using the Two-Dimensional Wold Decomposition Model by Joseph M. Francos; and
Subspace Approximation for Adaptive Multichannel Radar Filtering by A.W. Bojanczyk,
W.L. Melvin, and E.J. Holder.

Furthermore, the use of an interference covariance matrix is not obvious from the 
use of a scalar to represent interference. Indeed, even if it were suggested, which it is not, 
making use of the interference covariance matrix is not a simple calculation, nor is it a
straightforward substitution of a matrix for a scalar. 

Thus, applicants' use of the interference covariance matrix, and recitation of same
in the claims, renders applicants' invention as claimed patentable over the cited
references.

The Office Action appears to be looking for some sort of explicit claim language 
that states that the interference-plus-noise being dealt with by applicants is not white. Not 
having found such language, the Office Action states that there is no such distinguishing 
limitation in the claims. However, applicants note that it is the use of the interference 
covariance matrix that allows the invention to be employed in an environment in which 
the interference is not white, and use of the interference covariance matrix is indeed 
recited in applicants' claims. Applicants in their previous response were simply pointing 
out a distinguishing advantage that results from the invention as claimed. 

Thus, all of applicants' claims are allowable over the cited references, individually 
or in combination. 
Conclusion

In view of the foregoing, it is submitted that the Examiner is in error. It is, 
accordingly, respectfully requested that the rejection of claims 1-4, 6-14, and 17-20 be 
reversed and the application passed to issue. 



Respectfully, 

G. J. Foschini 
A. Lozano 
F. Rashid-Farrokhi 
R. A. Valenzuela 




Reg. No. 36,658 
732-949-1857 

Lucent Technologies Inc. 
Date:
APPENDIX A

Claims 



1. A method for transmitting signals in communications system having a
transmitter with N transmit antennas transmitting over a forward channel to a receiver
having L receiver antennas and a reverse channel for communicating from said receiver to
said transmitter, in which there may exist correlation in the signals received by two or
more of said L receive antennas, the method comprising the steps of:
determining the number of independent signals that can be transmitted from said
N transmit antennas to said L receive antennas;
creating, from a data stream, a data substream to be transmitted for each of the
number of independent signals that can be transmitted from said N transmit antennas to
said L receive antennas;
weighting each of said substreams with N weights, one weight for each of said N
transmit antennas, said weights being determined by said transmitter as a function of
channel information and an interference covariance matrix, to produce N weighted
substreams per substream;
combining one of said weighted substreams produced from each of said
substreams for each of said transmit antennas to produce a transmit signal for each of said
transmit antennas.

2. The invention as defined in claim 1 further comprising the step of transmitting
said transmit signal from a respective one of said antennas.

3. The invention as defined in claim 1 further comprising the step of receiving
said weights via said reverse channel.

4. The invention as defined in claim 1 wherein said channel information and said
interference covariance matrix are received by said transmitter from said receiver via said
reverse channel.

5. The invention as defined in claim 1 wherein said weights are determined by
solving a matrix equation $H^{\dagger}(K^{N})H = U^{\dagger}\Lambda^{2}U$ where:
H is a channel response matrix,
$H^{\dagger}$ is a conjugate transpose of said channel response matrix H,
$K^{N}$ is the interference covariance matrix,
U is a unitary matrix, each column of which is an eigenvector of $H^{\dagger}(K^{N})H$,
$\Lambda$ is a diagonal matrix defined as $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_M)$, where $\lambda_1, \ldots, \lambda_M$ are each
eigenvalues of $H^{\dagger}(K^{N})H$, M being the maximum number of nonzero eigenvalues, which
corresponds to the number of said independent signals, and
$U^{\dagger}$ is the conjugate transpose of matrix U;
waterfilling said eigenvalues $\lambda$ by solving the simultaneous equations
$\tilde{\lambda}_k = (\nu - 1/\lambda_k)^{+}$ and $\sum_k \tilde{\lambda}_k = P$, for $\nu$, where:
k is an integer index that ranges from 1 to M,
P is the transmitted power,
+ is an operator that returns zero (0) when its argument is negative, and returns the
argument itself when it is positive, and
each $\tilde{\lambda}$ is an intermediate variable representative of a power for each weight
vector;
defining matrix $\Phi$ as $\Phi = U^{\dagger}\,\mathrm{diag}(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_M)\,U$, where diag indicates that the
various $\tilde{\lambda}$ are arranged as the elements of the main diagonal of matrix $\Phi$;
wherein each column of matrix $\Phi$ is used as a normalized weight vector indicated
by $\Phi = [z_1, \ldots, z_N]$ and said normalized weight vectors are made up of individual
normalized weights z, $z_i = [z_{i1}, \ldots, z_{iN}]$, where i is an integer ranging from 1 to N;
developing an unnormalized weight vector $w_i = [w_{i1}, \ldots, w_{iN}]$, with each of said
weights therein being $\sqrt{\tilde{\lambda}_i}\, z_{ij}$, where j is an integer ranging from 1 to N.
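
For readers unfamiliar with waterfilling, the following Python sketch (using NumPy) shows the standard textbook form of the power allocation, p_k = (nu - 1/lambda_k)^+ with the p_k summing to P, which is assumed here to correspond to the simultaneous equations recited in claim 5. It is an illustration only and not part of the claims: the function name `waterfill`, the search strategy, and the example eigenvalues are assumptions.

```python
import numpy as np

def waterfill(eigenvalues, total_power):
    """Standard waterfilling (illustrative sketch, not the claimed method).

    Finds the water level nu such that p_k = max(nu - 1/lam_k, 0) and
    sum_k p_k = total_power. Returns (nu, powers), with powers in the
    same order as the input eigenvalues.
    """
    lam = np.asarray(eigenvalues, dtype=float)
    if np.any(lam <= 0) or total_power <= 0:
        raise ValueError("eigenvalues and total_power must be positive")
    inv = np.sort(1.0 / lam)  # ascending: strongest modes first
    # Try keeping the m strongest modes active, largest m first.
    for m in range(len(inv), 0, -1):
        nu = (total_power + inv[:m].sum()) / m
        if nu > inv[m - 1]:  # weakest active mode still receives power
            break
    powers = np.maximum(nu - 1.0 / lam, 0.0)
    return nu, powers

nu, p = waterfill([4.0, 1.0, 0.1], total_power=1.0)
print(nu, p)  # 1.125 [0.875 0.125 0.   ]
```

In this example the weakest mode receives no power, which mirrors the idea that the number of simultaneously transmitted substreams can be reduced when some channel modes are too weak.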
6. Apparatus for transmitting signals in communications system having a
transmitter with N transmit antennas transmitting over a forward channel to a receiver
having L receiver antennas and a reverse channel for communicating from said receiver to
said transmitter, in which there may exist correlation in the signals received by two or
more of said L receive antennas, the apparatus comprising:
means for determining the number of independent signals that can be transmitted
from said N transmit antennas to said L receive antennas;
means for creating, from a data stream, a data substream to be transmitted for each
of the number of independent signals that can be transmitted from said N transmit
antennas to said L receive antennas;
means for weighting each of said substreams with N weights, one weight for each
of said N transmit antennas, said weights being determined by said apparatus for
transmitting signals as a function of information about said forward channel and an
interference covariance matrix, to produce N weighted substreams per substream;
means for combining one of said weighted substreams produced from each of said
substreams for each of said antennas to produce a transmit signal for each antenna.



7. The invention as defined in claim 6 wherein said transmitter comprises means
for developing said weights.

8. The invention as defined in claim 6 wherein said transmitter comprises means
for storing said weights.

9. The invention as defined in claim 6 wherein said receiver comprises means
for developing said weights.
10. A transmitter for transmitting signals in communications system having a
transmitter with N transmit antennas transmitting over a forward channel to a receiver
having L receiver antennas and a reverse channel for communicating from said receiver to
said transmitter, in which there may exist correlation in the signals received by two or
more of said L receive antennas, the transmitter comprising:
a demultiplexer for creating, from a data stream, a data substream to be
transmitted for each of the number of independent signals that can be transmitted from
said N transmit antennas to said L receive antennas
multipliers for weighting each of said substreams with N weights, one weight for
each of said N transmit antennas, wherein said weights are determined in said transmitter
in response to an interference covariance matrix estimate and an estimate of the forward
channel response, to produce N weighted substreams per substream, each of said weights
being a function of at least an estimate interference covariance matrix and an estimate of
a forward matrix channel response between said transmitter and said receiver; and
adders for combining one of said weighted substreams produced from each of said
substreams for each of said antennas to produce a transmit signal for each of said transmit
antennas.

11. The invention as defined in claim 10 further comprising a digital to analog
converter for converting each of said combined weighted substreams.

12. The invention as defined in claim 10 further comprising an upconverter for
converting to radio frequencies each of said analog-converted combined weighted
substreams.

13. The invention as defined in claim 10 wherein said interference covariance
matrix estimate and said estimate of the forward channel response are received by said
transmitter from said receiver over said reverse channel.

14. The invention as defined in claim 10 wherein said weights are determined in
said receiver and are transmitted to said transmitter over said reverse channel.

15. The invention as defined in claim 10 wherein said weights are determined by
solving a matrix equation $H^{\dagger}(K^{N})H = U^{\dagger}\Lambda^{2}U$ where:
H is a channel response matrix,
$H^{\dagger}$ is a conjugate transpose of said channel response matrix H,
$K^{N}$ is the interference covariance matrix,
U is a unitary matrix, each column of which is an eigenvector of $H^{\dagger}(K^{N})H$,
$\Lambda$ is a diagonal matrix defined as $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_M)$, where $\lambda_1, \ldots, \lambda_M$ are each
eigenvalues of $H^{\dagger}(K^{N})H$, M being the maximum number of nonzero eigenvalues, which
corresponds to the number of said independent signals, and
$U^{\dagger}$ is the conjugate transpose of matrix U;
waterfilling said eigenvalues $\lambda$ by solving the simultaneous equations
$\tilde{\lambda}_k = (\nu - 1/\lambda_k)^{+}$ and $\sum_k \tilde{\lambda}_k = P$, for $\nu$, where:
k is an integer index that ranges from 1 to M,
P is the transmitted power,
+ is an operator that returns zero (0) when its argument is negative, and returns the
argument itself when it is positive, and
each $\tilde{\lambda}$ is an intermediate variable representative of a power for each weight
vector;
defining matrix $\Phi$ as $\Phi = U^{\dagger}\,\mathrm{diag}(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_M)\,U$, where diag indicates that the
various $\tilde{\lambda}$ are arranged as the elements of the main diagonal of matrix $\Phi$;
wherein each column of matrix $\Phi$ is used as a normalized weight vector indicated
by $\Phi = [z_1, \ldots, z_N]$ and said normalized weight vectors are made up of individual
normalized weights z, $z_i = [z_{i1}, \ldots, z_{iN}]$, where i is an integer ranging from 1 to N;
developing unnormalized weight vector $w_i = [w_{i1}, \ldots, w_{iN}]$, with each of said
weights therein being $\sqrt{\tilde{\lambda}_i}\, z_{ij}$, where j is an integer ranging from 1 to N.

16. The invention as defined in claim 10 wherein said transmitter and receiver
communicate using time division multiplexing (TDD) and said weights are determined in
said transmitter using an estimate of the forward channel response that is determined by a
receiver of said reverse link for said transmitter.

17. A receiver for use in a MIMO system, comprising:
L antennas;
L downconverters;
an estimator for determining an estimate of an interference covariance matrix for a
forward channel being received by said receiver; and
a transmitter for a reverse channel for transmitting said estimate of an interference
covariance matrix to a receiver for said reverse channel.
18. A receiver for use in a MIMO system, comprising:
L antennas;
L downconverters;
an estimator for determining an estimate of an interference covariance matrix for a
forward channel being received by said receiver;
an estimator for determining an estimate of a channel response for a forward
channel being received by said receiver; and
a transmitter for a reverse channel for transmitting said estimate of an interference
covariance matrix and said estimate of a channel response to a receiver for said reverse
channel.

19. A receiver for use in a MIMO system, comprising:
an estimator for determining an estimate of an interference covariance matrix for a
forward channel being received by said receiver;
an estimator for determining an estimate of a channel response for a forward
channel being received by said receiver; and
a weight calculator for calculating weights for use by a transmitter of said forward
channel to transmit data substreams to said receiver as a function of said estimate of an
interference covariance matrix for a forward channel being received by said receiver and
said estimate of a channel response for a forward channel being received by said receiver.

20. The invention as defined in claim 19 further including a transmitter for a
reverse channel for transmitting said weights to a receiver for said reverse channel.

21. A receiver for use in a MIMO system, comprising:
L antennas;
L downconverters;
an estimator for determining an estimate of an interference covariance matrix for a
forward channel being received by said receiver;
an estimator for determining an estimate of a channel response for a forward
channel being received by said receiver; and
a weight calculator for calculating weights for use by a transmitter of said forward
channel to transmit data substreams to said receiver, said weights being determined in
said weight calculator by
solving a matrix equation $H^{\dagger}(K^{N})H = U^{\dagger}\Lambda^{2}U$ where:
H is a channel response matrix,
$H^{\dagger}$ is a conjugate transpose of said channel response matrix H,
$K^{N}$ is the interference covariance matrix,
U is a unitary matrix, each column of which is an eigenvector of $H^{\dagger}(K^{N})H$,
$\Lambda$ is a diagonal matrix defined as $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_M)$, where $\lambda_1, \ldots, \lambda_M$ are each
eigenvalues of $H^{\dagger}(K^{N})H$, M being the maximum number of nonzero eigenvalues, which
corresponds to the number of said independent signals, and
$U^{\dagger}$ is the conjugate transpose of matrix U;
waterfilling said eigenvalues $\lambda$ by solving the simultaneous equations
$\tilde{\lambda}_k = (\nu - 1/\lambda_k)^{+}$ and $\sum_k \tilde{\lambda}_k = P$, for $\nu$, where:
k is an integer index that ranges from 1 to M,
P is the transmitted power,
+ is an operator that returns zero (0) when its argument is negative, and returns the
argument itself when it is positive, and
each $\tilde{\lambda}$ is an intermediate variable representative of a power for each weight
vector;
defining matrix $\Phi$ as $\Phi = U^{\dagger}\,\mathrm{diag}(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_M)\,U$, where diag indicates that the
various $\tilde{\lambda}$ are arranged as the elements of the main diagonal of matrix $\Phi$;
wherein each column of matrix $\Phi$ is used as a normalized weight vector indicated
by $\Phi = [z_1, \ldots, z_N]$ and said normalized weight vectors are made up of individual
normalized weights z, $z_i = [z_{i1}, \ldots, z_{iN}]$, where i is an integer ranging from 1 to N;
developing unnormalized weight vector $w_i = [w_{i1}, \ldots, w_{iN}]$, with each of said
weights therein being $\sqrt{\tilde{\lambda}_i}\, z_{ij}$, where j is an integer ranging from 1 to N.



22. A method for determining weights for use in transmitting signals in
communications system having a transmitter with N transmit antennas transmitting over a
forward channel to a receiver having L receiver antennas and a reverse channel for
communicating from said receiver to said transmitter, in which there may exist
correlation in the signals received by two or more of said L receive antennas, the method
comprising the steps of:
determining the number of independent signals M that can be transmitted from
said N transmit antennas to said L receive antennas through a process of determining
weights for substreams derived from data to be transmitted via said N antennas as part of
forming said signals, wherein said weights are determined by
solving a matrix equation $H^{\dagger}(K^{N})H = U^{\dagger}\Lambda^{2}U$ where:






Serial No. 08/866,754 



H is a channel response matrix,
$H^{\dagger}$ is a conjugate transpose of said channel response matrix H,
$K^{N}$ is the interference covariance matrix,
U is a unitary matrix, each column of which is an eigenvector of $H^{\dagger}(K^{N})H$,
$\Lambda$ is a diagonal matrix defined as $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_M)$, where $\lambda_1, \ldots, \lambda_M$ are each
eigenvalues of $H^{\dagger}(K^{N})H$, M being the maximum number of nonzero eigenvalues, which
corresponds to the number of said independent signals, and
$U^{\dagger}$ is the conjugate transpose of matrix U;
waterfilling said eigenvalues $\lambda$ by solving the simultaneous equations
$\tilde{\lambda}_k = (\nu - 1/\lambda_k)^{+}$ and $\sum_k \tilde{\lambda}_k = P$, for $\nu$, where:
k is an integer index that ranges from 1 to M,
P is the transmitted power,
+ is an operator that returns zero (0) when its argument is negative, and returns the
argument itself when it is positive, and
each $\tilde{\lambda}$ is an intermediate variable representative of a power for each weight
vector;
defining matrix $\Phi$ as $\Phi = U^{\dagger}\,\mathrm{diag}(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_M)\,U$, where diag indicates that the
various $\tilde{\lambda}$ are arranged as the elements of the main diagonal of matrix $\Phi$;
wherein each column of matrix $\Phi$ is used as a normalized weight vector indicated
by $\Phi = [z_1, \ldots, z_N]$ and said normalized weight vectors are made up of individual
normalized weights z, $z_i = [z_{i1}, \ldots, z_{iN}]$, where i is an integer ranging from 1 to N;
developing unnormalized weight vector $w_i = [w_{i1}, \ldots, w_{iN}]$, with each of said
weights therein being $\sqrt{\tilde{\lambda}_i}\, z_{ij}$, where j is an integer ranging from 1 to N.
MIMO Channel Capacity in Co-Channel Interference

Yi Song and Steven D. Blostein 
Department of Electrical and Computer Engineering 
Queen's University 
Kingston, Ontario, Canada, K7L 3N6 
E-mail: {songy, sdb}@ee.queensu.ca 



Abstract: Recent information theory results have indicated that a large channel capacity
exists for wireless systems with multiple transmit and receive antennas. With different
assumptions of channel knowledge and interference knowledge at the transmitter, the
channel capacity of multiple input multiple output (MIMO) systems has been studied
under both spatially white and colored interference and noise. In this paper, we fix the
total interference-plus-noise power and evaluate the outage capacity under two spatially
colored interference environments: (1) a few high-data-rate interferers each with high
power, (2) a large number of low-data-rate interferers each with low power. The results
show that MIMO capacity is larger with fewer high-data-rate interferers. We also assess
the impact of an estimated channel and/or interference on capacity. In the case of 4
transmit and 4 receive antennas for the user of interest, 10 interferers, total-interference-
to-noise ratio and signal-to-noise ratio are both 20dB, the results show that it is beneficial
to estimate the channel and/or interference if the variance of estimation error is less than
about 50% of the variance of true channel and/or interference.

I. Introduction 

Recent information theory results have indicated that a large channel capacity exists for
wireless systems with multiple transmit and receive antennas [1]. With different
assumptions of channel knowledge and interference knowledge at the transmitter, the
channel capacity of multiple input multiple output (MIMO) systems have been studied
under both spatially white and colored interference and noise by applying different power
allocation schemes at the transmitter [2] [3]. Meanwhile, in future generation wireless
communication systems, multi-rate data services will be dominant. To support users of
different data rate at a certain quality of service (e.g., a certain level of bit error rate),
the user's transmit power is, in general, proportional to the data rate. Therefore,
high-data-rate users need high transmit powers.

In multiple-access systems, the interference is, in gen- 
eral, spatially colored. In this paper, we fix the total 
interference-plus-noise power and examine the MIMO out- 
age capacity under two spatially colored interference en- 
vironments: (1) a few high-data-rate interferers each with 
high power, (2) a large number of low-data-rate interfer- 
ers each with low power. We would like to find out under 
which interference environment a MIMO system achieves 
a higher outage capacity. The assumption of fixed total 
interference-plus-noise power is reasonable since, in wireless
systems, there is likely some mechanism, such as power
control, to control the interference experienced by a user.
The results of this work have implications for the design of



the medium access control (MAC) protocols and schedul- 
ing of packet transmissions in future wireless systems. We 
will also assess the impact of an estimated channel and/or 
interference on capacity. 

II. System Model 

We consider a single-user narrowband link with interference
from other users. The user of interest is equipped with
M transmit and N receive antennas. Each interfering user
has one transmit antenna, and the same N receive antennas
as the user of interest. The received signal vector y
(N x 1) is

y = H s + \sqrt{P_I / L} \sum_{i=1}^{L} h_i s_i + w = H s + n    (1)

where H (N x M) is the MIMO channel matrix of the
user of interest, s (M x 1) is the transmit signal vector
of the user of interest, and n (N x 1) is the interference-plus-noise
vector at the receiver. The number of interferers is
L, P_I is the fixed total interference power, h_i (N x 1) is
the channel vector of the ith interferer, s_i is the ith interferer's
transmit signal with unit power, and w (N x 1) is
the thermal noise with covariance matrix E{w w^\dagger} = \sigma^2 I_N,
where \dagger denotes conjugate transpose. The channel matrix
H and the channel vectors h_i are mutually independent,
and assumed to be quasi-static (constant over one frame)
with uncorrelated realizations in different frames. It is
further assumed that the elements in H and h_i are independent
and identically distributed (i.i.d.) complex Gaussian
random variables with zero mean and unit variance.

This implies flat Rayleigh fading and that antennas are
separated far apart. The signal of the user of interest s,
the interfering signals s_i, and the thermal noise w are mutually
independent. It is clear from (1) that each interferer
has the same power: the more interferers in the system, the lower
the power of each interferer. It can be shown that the covariance
matrix of the interference-plus-noise is

R_0 = E{n n^\dagger} = (P_I / L) \sum_{i=1}^{L} h_i h_i^\dagger + \sigma^2 I_N,    (2)

and the covariance matrix of the received signal is

E{y y^\dagger} = H \Sigma_s H^\dagger + R_0    (3)

where \Sigma_s = E{s s^\dagger} is the transmit signal covariance matrix.



In our system model, we assume each interferer has one 
transmit antenna. However, it is easy to accommodate 
interferers with more than one transmit antenna by aggre- 
gating several interfering users with one transmit antenna. 
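
As an illustration of the model in (1)-(2), the following minimal NumPy sketch draws one realization of the interferer channels and forms the interference-plus-noise covariance. The function names, the seed, and the numerical values (N = 4, L = 10, P_I = 100, sigma^2 = 1) are assumptions chosen for the example and do not come from the paper.

```python
import numpy as np

def complex_gaussian(rng, shape):
    """i.i.d. zero-mean, unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def interference_covariance(h, total_power, noise_var):
    """R_0 = (P_I / L) * sum_i h_i h_i^H + sigma^2 I_N, with h of shape (N, L)."""
    n_rx, n_int = h.shape
    return (total_power / n_int) * (h @ h.conj().T) + noise_var * np.eye(n_rx)

# Hypothetical example values: 4 receive antennas, 10 interferers.
rng = np.random.default_rng(0)
N, L = 4, 10
h = complex_gaussian(rng, (N, L))
R0 = interference_covariance(h, total_power=100.0, noise_var=1.0)
```

Because the total power P_I is split evenly over the L columns of h, increasing L changes the spatial structure of R_0 while its expected trace stays fixed.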

III. Channel Capacity 

In this section, we derive the MIMO channel capacity 
with spatially colored interference and under different as- 
sumptions of channel and interference knowledge at the 
transmitter: (1) both channel and interference covariance
matrices H and R_0 are available, (2) only H is available,
and (3) neither H nor R_0 is available. In all the cases, we
assume that the receiver knows the channel H. Compared
to [3], our derivation uses a modeled interference covariance
matrix as in (2). In addition, we give a new interpretation of
MIMO channel capacity under spatially colored interfer- 
ence. 

We introduce the differential entropy of a circularly 
symmetric complex Gaussian random vector. If x is 
a circularly symmetric complex Gaussian random vector 
with covariance matrix Q, the differential entropy of x is 
\log_2 \det(\pi e Q). In addition, circularly symmetric complex
Gaussians are entropy maximizers [4]. 

Assuming the interference-plus-noise n in (1) is circularly
symmetric complex Gaussian, the optimal distribution for 
the signal s is then circularly symmetric complex Gaussian 
[4] [5]. As the receiver knows the channel, the mutual information
between the channel input and output is given as

I(s; y) = \log_2 \det[\pi e (H \Sigma_s H^\dagger + R_0)] - \log_2 \det(\pi e R_0)
        = \log_2 \det(I_N + R_0^{-1} H \Sigma_s H^\dagger)
        = \log_2 \det[I_N + (R^{-1/2} H)(\Sigma_s / \sigma^2)(R^{-1/2} H)^\dagger]    (4)

where

R = (\rho_I / L) \sum_{i=1}^{L} h_i h_i^\dagger + I_N    (5)

and the third equality comes from the facts that \det(I + AB) = \det(I + BA)
for square matrices A and B and that R^{-1/2} is Hermitian. We denote
\rho_I = P_I / \sigma^2 as the ratio of total interference power to noise power.
The channel capacity is the maximized mutual information with transmit
power constraint tr(\Sigma_s) \le P_T, i.e.,

C = \max_{tr(\Sigma_s) \le P_T} \log_2 \det[I_N + (R^{-1/2} H)(\Sigma_s / \sigma^2)(R^{-1/2} H)^\dagger]    (6)

where \rho_s = P_T / \sigma^2 is the ratio of signal power to noise power.

Eqn. (6) suggests that we could consider R^{-1/2} H as
a combined channel. As a result, the capacity in (6) is
equivalent to the capacity of the combined channel R^{-1/2} H
under spatially white noise. With this new interpretation
and the results of channel capacity under spatially white
noise in [2], we obtain the capacity with spatially colored
interference and noise.

A. Both channel and interference information at the transmitter

By applying water-filling power allocation with the combined
channel R^{-1/2} H at the transmitter, the channel capacity is

C = \sum_{i=1}^{M} \log_2(1 + p_i \lambda_i)    (7)

the optimal transmit signal covariance matrix is

\Sigma_s = \sigma^2 U diag(p_1, ..., p_M) U^\dagger    (8)

where

H^\dagger R^{-1} H = U \Lambda U^\dagger,  \Lambda = diag(\lambda_1, ..., \lambda_M),    (9)

\lambda_1, ..., \lambda_M are the eigenvalues of H^\dagger R^{-1} H, U is a unitary
matrix consisting of eigenvectors of H^\dagger R^{-1} H,

p_i = (\mu - 1/\lambda_i)^+,  i = 1, ..., M,    (10)

where \mu is chosen such that

\sum_{i=1}^{M} p_i = P_T / \sigma^2    (11)

and (x)^+ denotes the larger of 0 and x.

B. Neither channel nor interference information at the transmitter

If the transmitter applies uniform power allocation
across the transmit antennas, i.e., \Sigma_s = (P_T / M) I_M, the
capacity is given by

C = \log_2 \det[I_N + (P_T / (M \sigma^2)) (R^{-1/2} H)(R^{-1/2} H)^\dagger].    (12)

C. Only channel information at the transmitter

It is claimed in [3] that the optimal power allocation
is water-filling using H and assuming the interference covariance
matrix to be an identity matrix, i.e., the optimal
transmit signal covariance matrix \Sigma_s is obtained from (8)-(11)
by setting R = I_N, and the capacity is obtained by
substituting the resultant \Sigma_s into (4). However, no justification
that this scheme is optimal was given in [3]. At the
same time, if we consider R^{-1/2} H as a combined channel,
without knowing R, we do not know this combined channel.
As a result, uniform power allocation at the transmitter
should be used, i.e., the capacity is as in (12). It is
not obvious which power allocation scheme gives a higher
capacity, uniform power allocation (Section III-B) or water-filling
(Section III-A) using R = I_N. Uniform power allocation
does not use the known channel information, while
the water-filling scheme uses the incorrect interference information.
We simulated 10,000 sets of H and R, and in
all cases the water-filling scheme using H only gives a higher
capacity than uniform power allocation. However, no proof
that this is true in general has been found.
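
The water-filling allocation of Section III-A and the uniform allocation of Section III-B can be compared numerically. The sketch below is an illustrative implementation under stated assumptions, not the authors' simulation code: the function names, the seed, the 4 x 4 dimensions and the 20 dB power ratios are chosen for the example, and `snr` stands for P_T / sigma^2. The uniform capacity uses det(I_N + c A A^H) = det(I_M + c A^H A) so that both cases depend only on the eigenvalues of H^H R^{-1} H.

```python
import numpy as np

def water_filling(gains, total):
    """Return p_i = (mu - 1/gain_i)^+ with sum(p) == total, for positive gains."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]          # strongest eigenmode first
    inv = 1.0 / gains[order]
    p = np.zeros_like(gains)
    for k in range(len(gains), 0, -1):       # try k active modes, then fewer
        mu = (total + inv[:k].sum()) / k     # water level for k active modes
        if mu > inv[k - 1]:                  # weakest active mode gets power > 0
            p[order[:k]] = mu - inv[:k]
            break
    return p

def capacities(H, R, snr):
    """Return (water-filling, uniform) capacity in bits/s/Hz."""
    M = H.shape[1]
    G = H.conj().T @ np.linalg.solve(R, H)               # H^H R^{-1} H
    lam = np.clip(np.linalg.eigvalsh(G), 1e-12, None)    # guard tiny eigenvalues
    p = water_filling(lam, snr)
    c_wf = np.sum(np.log2(1.0 + p * lam))
    c_uni = np.sum(np.log2(1.0 + (snr / M) * lam))
    return c_wf, c_uni

# Hypothetical example: 4 x 4 link, 4 interferers, both power ratios 20 dB.
rng = np.random.default_rng(1)
cg = lambda shape: (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
H, h = cg((4, 4)), cg((4, 4))
R = (100.0 / 4) * (h @ h.conj().T) + np.eye(4)
c_wf, c_uni = capacities(H, R, snr=100.0)
```

The "only channel information" case of Section III-C could be sketched by computing the powers from `capacities`-style water-filling with R replaced by the identity and then evaluating (4) with the true R; that step is omitted here to keep the example short.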

IV. CHANNEL CAPACITY WITH ESTIMATED CHANNEL 
AND INTERFERENCE 

When the transmitter is provided with estimates of the channel
and/or interference covariance matrix, we can calculate
the capacity by applying water-filling as in Section III-A
using the estimated interference covariance and channel matrices
\hat{R} and \hat{H}, respectively. As a result, we are able to
evaluate the degradation of capacity due to estimation error
in the channel and interference covariance matrices.

We model the estimate of H as

\hat{H} = H + E_H    (13)

where H is the true channel. The elements in the estimation
error matrix, E_H, are i.i.d. zero-mean complex Gaussian.
This implies that the estimation errors of the channel
are mutually independent. We assume that the variance
of the estimation error is proportional to the variance of the true
channel. Therefore, the variance of the (i,j)th element of
E_H is specified by

VAR(E_{H,ij}) = \mu_H \cdot VAR(H_{ij})    (14)

where \mu_H is the parameter that controls the quality of the
estimate. As the (i,j)th element in H, H_{ij}, is complex
Gaussian with unit variance, VAR(H_{ij}) = 1.
Similarly, we model the estimate of R as

\hat{R} = R + E_R    (15)

where R is the true interference covariance matrix. We
restrict the estimation error matrix E_R to be Hermitian.
We assume that the elements in the lower triangle of E_R,
E_{R,ij} for i \le j, are mutually independent. The elements
E_{R,ij} for i < j are i.i.d. complex Gaussian, while the
diagonal elements of E_R are i.i.d. real Gaussian. Again,
the variance of E_{R,ij} is specified by

VAR(E_{R,ij}) = \mu_R \cdot VAR(R_{ij}).    (16)

The variance of the diagonal elements in R can be calculated as

VAR(R_{jj}) = (\rho_I / L)^2 \sum_{i=1}^{L} VAR(h_{ij} h_{ij}^*)    (17)

where h_{ij} is the jth element in vector h_i and \rho_I is the ratio
of total interference power to noise power. Since h_{ij} is
zero-mean complex Gaussian with unit variance, h_{ij} h_{ij}^*
is chi-square distributed with 2 degrees of freedom, and
VAR(h_{ij} h_{ij}^*) = 1. As the h_{ij} are i.i.d. for all i and j, we
have

VAR(R_{jj}) = \rho_I^2 / L.    (18)

The variance of off-diagonal elements in R is

VAR(R_{j_1 j_2}) = (\rho_I / L)^2 \sum_{i=1}^{L} VAR(h_{i j_1} h_{i j_2}^*).    (19)

Let h_{i j_1} = a_1 + j b_1 and h_{i j_2} = a_2 + j b_2, where a_1, a_2, b_1 and b_2 are
i.i.d. zero-mean real Gaussian with variance 1/2, so that each h has unit variance. It
can be shown that E(h_{i j_1} h_{i j_2}^*) = 0 and VAR(h_{i j_1} h_{i j_2}^*) = 1.

With specified \mu_H and \mu_R, we are able to simulate estimated
channel and interference covariance matrices \hat{H} and
\hat{R}, respectively. The optimal transmit signal covariance
matrix \Sigma_s is found by applying water-filling, i.e., (8)-(11)
with estimates \hat{H} and/or \hat{R}. The capacity is then obtained
by substituting the resultant \Sigma_s into (4).
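
The error model of (13)-(16) can be sketched as follows. This is a hedged illustration only: the helper names and the seed are assumptions, and the code uses the value rho_I^2 / L for the variance of every element of R, which follows from (17)-(19) with unit-variance summands. The perturbed covariance estimate is Hermitian by construction but is not guaranteed to be positive definite for large mu_R.

```python
import numpy as np

def complex_gaussian(rng, shape):
    """i.i.d. zero-mean, unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def noisy_estimates(H, R, mu_h, mu_r, rho_i, n_int, rng):
    """Return (H + E_H, R + E_R) with error variances mu_h and mu_r * rho_i^2 / L."""
    n = R.shape[0]
    E_H = np.sqrt(mu_h) * complex_gaussian(rng, H.shape)
    var_r = mu_r * rho_i ** 2 / n_int                   # mu_R * VAR(R_ij)
    upper = np.sqrt(var_r) * np.triu(complex_gaussian(rng, (n, n)), k=1)
    diag = np.sqrt(var_r) * np.diag(rng.standard_normal(n))
    E_R = upper + upper.conj().T + diag                 # Hermitian error matrix
    return H + E_H, R + E_R

# Hypothetical example: 4 x 4 link, 10 interferers, rho_I = 100, mu_H = mu_R = 0.5.
rng = np.random.default_rng(2)
H = complex_gaussian(rng, (4, 4))
h = complex_gaussian(rng, (4, 10))
R = (100.0 / 10) * (h @ h.conj().T) + np.eye(4)
H_hat, R_hat = noisy_estimates(H, R, 0.5, 0.5, 100.0, 10, rng)
```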

V. Simulation Results 

We calculate the capacity under different assumptions of 
knowledge of channel and interference at the transmitter. 
For the case of only channel information at the transmitter, 
we use (8)-(11) and set R = I_N to obtain the capacity. As
H and R are random matrices, the capacity is treated as
a random variable. The performance measure here is
the 10% outage capacity, C_{0.1}, where P(C < C_{0.1}) = 10%.
Monte Carlo simulation is used to obtain the 10% outage 
capacity. 
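
A minimal Monte Carlo sketch of the 10% outage capacity is given below for the uniform-allocation case (12). The function name, default arguments, trial count and seed are assumptions for illustration; the paper does not specify its simulation code. The empirical 10% quantile of the simulated capacities is used as the estimate of C_{0.1}.

```python
import numpy as np

def cgauss(rng, shape):
    """i.i.d. zero-mean, unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def outage_capacity(n_int, n_ant=4, snr_db=20.0, inr_db=20.0,
                    trials=2000, outage=0.10, seed=0):
    """Empirical outage capacity (bits/s/Hz) with uniform power allocation."""
    rng = np.random.default_rng(seed)
    snr, inr = 10 ** (snr_db / 10), 10 ** (inr_db / 10)
    caps = np.empty(trials)
    for t in range(trials):
        H = cgauss(rng, (n_ant, n_ant))
        h = cgauss(rng, (n_ant, n_int))
        R = (inr / n_int) * (h @ h.conj().T) + np.eye(n_ant)
        # det(I_N + c R^{-1/2} H H^H R^{-1/2}) = det(I_M + c H^H R^{-1} H)
        A = np.eye(n_ant) + (snr / n_ant) * (H.conj().T @ np.linalg.solve(R, H))
        caps[t] = np.linalg.slogdet(A)[1] / np.log(2)
    return np.quantile(caps, outage)

c_few = outage_capacity(n_int=1)    # one strong interferer
c_many = outage_capacity(n_int=10)  # ten weak interferers, same total power
```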

In Fig. 1, we fix the total interference power and evaluate 
the outage capacity as the number of interferers increases. 
The user of interest is assumed to have 4 transmit and 4 
receive antennas. The ratio of signal power to noise power 
and the ratio of total interference power to noise power 
are both 20dB. We find that the 10% outage capacity de- 
creases significantly as the number of interferers increases. 
When the channel and interference are not known at the 
transmitter, the capacity with one interferer is 16 bps/Hz. 
This number is reduced sharply to 3 bps/Hz with 10 interferers,
each with one [...]. This implies that
MIMO systems perform more efficiently when there are a
few strong interferers.

In Fig. 2, we fix the number of interferers to be 4, and 
examine the outage capacity as we increase the ratio of total
interference power to noise power. Again, the user of
interest is assumed to have 4 transmit and 4 receive antennas.
The ratio of signal power to noise power is 15dB.
We observe that when the total interference power is low,
knowing only the channel allows us to achieve about the 
same capacity as that in the case of full knowledge of chan- 
nel and interference. However, when the total interference 
power is high, without interference information, knowing 
only the channel leads to about the same capacity as that 
in the case of no channel and interference knowledge at the 
transmitter. 



In Fig. 3, assuming the user of interest has the same 
number of transmit and receive antennas, we calculate the 
outage capacity as the number of transmit antennas increases.
We fix the number of interferers to be 4; the ratio
of signal power to noise power and the ratio of total interference
power to noise power are both 20dB. We observe
that the capacity increases almost linearly with the number
of antennas. In addition, the differences in capacity under
different knowledge of channel and interference increase as 
the number of antennas increases. 

In Fig. 4, we assess the degradation of channel capacity 
using estimated channel and/or interference. The user of 
interest is assumed to have 4 transmit and 4 receive anten- 
nas, and the ratio of signal power to noise power and the 
ratio of total interference power to noise power are both 
20dB. The number of interferers is 10. In the case of no 
knowledge of channel and interference at the transmitter, 
we compare the capacity of uniform power allocation to 
that of water-filling using estimated channel and interference.
We observe that, for \mu_H = \mu_R, when \mu_H and \mu_R are
less than 50%, water-filling using estimated channel and
interference achieves a higher capacity than uniform power
allocation. In the case of only channel information at the
transmitter, we compare the capacity of water-filling using
known channel and estimated interference to that of
water-filling using channel only. Again, for \mu_H = \mu_R, we
observe that when \mu_R is less than 50%, water-filling using
known channel information and an estimate of the interference
covariance matrix is better than water-filling using channel
information only. Fig. 4 also shows the degradation of capacity
due to estimation error of channel and interference
for the cases of \mu_H = 0.1 \mu_R and \mu_H = 10 \mu_R.







Fig. 1. 10% outage capacity versus number of interferers. The user of 
interest has 4 transmit and 4 receive antennas, the ratio of signal 
power to noise power is 20dB, and the ratio of total interference 
power to noise power is 20dB. 



VI. Conclusions 

In this paper, we fixed the total interference-plus-noise 
power and examined MIMO outage capacity under differ- 
ent interference environments: (1) a few high-data-rate in- 
terferers each with high power, (2) a large number of low- 
data-rate interferers each with low power. The results show 
that MIMO capacity is larger with fewer high-data-rate 
interferers. We also evaluated the degradation of outage 



capacity using estimated channel and/or interference.

Fig. 2. 10% outage capacity versus the ratio of total interference 
power to noise power. The user of interest has 4 transmit and 4 
receive antennas, the ratio of signal power to noise power is 15dB, 
and the number of interferers is 4. 

Fig. 3. 10% outage capacity versus number of antennas, assuming 
the user of interest has the same number of transmit and receive 
antennas. The number of interferers is 4, the ratio of signal power 
to noise power is 20dB, and the ratio of total interference power 
to noise power is 20dB. 




Fig. 4. 10% outage capacity versus \mu_R. The user of interest has 4 
transmit and 4 receive antennas, the number of interferers is 10, 
the ratio of signal power to noise power is 20dB, and the ratio of 
total interference power to noise power is 20dB. 

Abstract 



The impact of interference on multiple-input multiple-output (MIMO) systems has attracted recent 
interest. Most studies of channel estimation and data detection for MIMO systems consider spatially and 
temporally white interference at the receiver. In this paper, we address channel estimation, interference 
correlation estimation and data detection for MIMO systems under both spatially and temporally col- 
ored interference. We examine the case of one dominant interferer in which the data rate of the desired 
user could be the same or a multiple of that of the interferer. Assuming known temporal interference 
correlation as a benchmark, we derive maximum-likelihood estimates of the channel matrix and spatial 
interference correlation matrix, and apply these estimates to a generalized version of the BLAST (Bell 
Labs Layered Space-Time) ordered data detection algorithm. We then investigate the performance loss 
by not exploiting interference correlation. For a (5, 5) MIMO system undergoing independent Rayleigh 
fading, we observe that exploiting both spatial and temporal interference correlation in channel estimation
and data detection results in potential gains of 1.5 dB and 4 dB for an interferer operating at the same 
data rate and at half the data rate, respectively. Ignoring temporal correlation, it is found that spatial 
correlation accounts for about 1 dB of this gain. 

Index Terms 

Multiple-input multiple-output, Interference, Channel estimation, Data detection 



I. Introduction 

Wireless systems with multiple transmitting and receiving antennas have been shown to have 
a large Shannon channel capacity in a rich scattering environment [1][2]. By transmitting par- 



Shannon capacity of the MIMO channel increases significantly with the number of transmitting
and receiving antennas [2]. Layered space-time architectures were proposed for high-rate 
transmission in [3] and [4]. Space-time coding techniques have also been investigated [5][6]. 

While substantial research efforts have focussed on point-to-point MIMO link performance, 
the impact of interference on MIMO systems has received less interest. In a cellular environment,
co-channel interference (CCI) from other cells exists due to channel reuse. In [7], channel 
capacities in the presence of spatially colored interference were derived under different assumptions
of knowledge of the channel matrix and interference statistics at the transmitter. The impact 
of spatially colored interference on MIMO channel capacity was studied in [8][9][10]. The capacity
of MIMO systems with interference in the limiting case of a large number of antennas 
was studied in [11]. The overall capacity of a group of users each employing a MIMO link 
was investigated in [12]. The output signal-to-interference-power ratio (SIR) was analytically 
calculated in [13] when a single data stream is transmitted over independent Rayleigh MIMO 
channels. While the majority of the studies deal with channel capacity, in this paper we will 
focus on the achievable symbol error rate performance of a MIMO link with interference. 

Prior results on estimation of vector channels and spatial interference statistics for CDMA 
(code division multiple access) single-input multiple-output systems can be found in [14]. Most 
studies of channel estimation and data detection for MIMO systems assume spatially and tem- 
porally white interference. For example, in [15], maximum-likelihood (ML) estimation of the 
channel matrix using training sequences was presented assuming temporally white interference. 
Assuming perfect knowledge of the channel matrix at the receiver, ordered zero-forcing (ZF) and 
minimum mean-squared error (MMSE) detection were studied for both spatially and temporally 
white interference in [4] and [16], respectively. However, in cellular systems, the interference 
is, in general, both spatially and temporally colored. 

In this paper, we propose and study a new algorithm that jointly estimates the channel matrix 
and the spatial interference correlation matrix in a maximum-likelihood framework. We develop 
a multi-vector-symbol MMSE data detector that exploits interference correlation. In the case 
of a single dominant interferer and large signal-to-noise ratio (SNR), we show that spatial and 
temporal second-order interference statistics can be decoupled in the form of a matrix Kronecker 
product. In finite SNR, the decoupling of spatial and temporal statistics of interference-plus- 
noise is only an approximation. We also determine the conditions where this approximation 
breaks down. 

Although temporal interference correlation is difficult to estimate in practice, our objectives 
are to determine the performance benchmark achieved if temporal correlation were known. As 
sources of temporal correlation, we consider cases in which the data rate of the desired user 
is either the same as or a multiple of that of the interferer. The new ML algorithm serves as a 
performance benchmark when temporal and spatial interference correlation are exploited in joint 
channel estimation and data detection. We also assess the performance improvement obtained 
in more practical cases where only part of the correlation information is exploited, including 
the performance obtained by assuming temporally white interference, i.e., ignoring temporal 
correlation. 

The paper is organized as follows. In Section II, we present our system model of temporal 
and spatial interference. In Section III, we derive ML estimates of channel and spatial inter- 
ference correlation matrices assuming known temporal interference correlation. In Section IV, 
one-vector-symbol detection is extended to a multi-vector-symbol version which is used to ex- 
ploit temporal interference correlation. In Section V, we consider the case of one interferer and 
large SNR and assess the benefits of taking temporal and/or spatial interference correlation into 
account for channel estimation and data detection. We then examine the level of SNR at which 
the approximation of separate spatial and temporal interference-plus-noise statistics breaks down. 
In cases where the spatial and temporal correlation are not separable, the performance improve- 
ment obtained by exploiting the spatial correlation is evaluated. For reference, comparisons are 
made to the well-known Direct Matrix Inversion (DMI) algorithm [25], generalized to multiple 
input signals, a batch method that does not require estimates of channel and spatial interference 
correlation matrices. 

In the paper, (\cdot)^T refers to transpose, (\cdot)^* refers to conjugate, (\cdot)^\dagger refers to conjugate 
transpose, and I_N refers to an N x N identity matrix. 

II. System Model 

We consider a single-user link consisting of N_t transmitting and N_r receiving antennas, denoted
as (N_t, N_r). The desired user transmits data frame by frame. Each frame has M data 
vectors. The first N data vectors are used for training so that the desired user's channel matrix 
and interference statistics can be estimated, and the remaining data vectors are for information 
transmission. In a slow flat fading environment, the received signal vector at time j is expressed 
as 

y_j = H x_j + n_j,  j = 0, ..., M - 1    (1) 

where x_j is the transmitted data vector, H is the N_r x N_t spatial channel gain matrix, and the
interference vector n_j is zero-mean circularly symmetric complex Gaussian. We assume channel 
matrix H is fixed during one frame. This is a reasonable assumption since high speed data ser- 
vices envisioned for MIMO systems are generally intended for low mobility users. By the same 
argument, it is also assumed that the interference statistics are fixed during one frame. 

In practice, the interference may be both spatially and temporally correlated. We assume that 
the cross-correlation between the interference vectors at time i and j is E{n_i n_j^\dagger} = \Lambda_M(i, j) R 
where \Lambda_M(i, j) is the (i, j)th element of an M x M matrix \Lambda_M. The (i, j)th element of matrix R 
is the correlation between the ith and jth elements of interference vector n_k, k \in {0, ..., M - 1}. 
As a result, the covariance matrix of the concatenated interference vector \bar{n} = [n_0^T ... n_{M-1}^T]^T 
is 

E{\bar{n} \bar{n}^\dagger} = \begin{bmatrix} \Lambda_M(0,0) R & \cdots & \Lambda_M(0,M-1) R \\ \vdots & \ddots & \vdots \\ \Lambda_M(M-1,0) R & \cdots & \Lambda_M(M-1,M-1) R \end{bmatrix} = \Lambda_M \otimes R    (2) 

where \otimes denotes Kronecker product, and matrices \Lambda_M and R capture the temporal and spatial
correlation of the interference, respectively. The above model implies that the spatial and 
temporal interference statistics are separable. The correlation matrices \Lambda_M and R are determined
by the application-specific signal model. In Section V, we provide an example in which 
the interference covariance matrix has the above Kronecker product form. When the interfer- 
ence statistics can only be approximated by (2), the conditions where this approximation breaks 
down are investigated in Section V-D.3. In addition to interference correlation, we remark that 
a decoupled temporal and spatial correlation structure arises in the statistics of fading vector 
channels consisting of a mobile with one antenna and a base station with an antenna array [17]. 
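
The Kronecker structure in (2) can be checked numerically. In the sketch below, the temporal correlation matrix with entries rho^{|i-j|} and the 2 x 2 spatial matrix are hypothetical choices made only for illustration; the paper leaves both matrices to the application-specific signal model. The example also evaluates the determinant and inverse identities for Kronecker products that are used in Section III.

```python
import numpy as np

M, N_r = 3, 2
rho = 0.5  # hypothetical temporal correlation coefficient
idx = np.arange(M)
Lam = rho ** np.abs(np.subtract.outer(idx, idx))         # temporal correlation (M x M)
R = np.array([[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.0]])     # spatial correlation (N_r x N_r)

cov = np.kron(Lam, R)                                    # covariance of stacked vector

# Block (i, j) of the stacked covariance equals Lam[i, j] * R.
block_12 = cov[1 * N_r:2 * N_r, 2 * N_r:3 * N_r]

# Identities for an m x m matrix A and an n x n matrix B:
#   det(A kron B) = det(A)^n det(B)^m,  (A kron B)^{-1} = A^{-1} kron B^{-1}
det_lhs = np.linalg.det(cov)
det_rhs = np.linalg.det(Lam) ** N_r * np.linalg.det(R) ** M
inv_rhs = np.kron(np.linalg.inv(Lam), np.linalg.inv(R))
```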



III. Joint Estimation of Channel and Spatial Interference Statistics 
During a training period of N vector symbols, we concatenate the received signal vectors, the 
training signal vectors and the interference vectors as \bar{y} = [y_0^T ... y_{N-1}^T]^T, \bar{x} = [x_0^T ... x_{N-1}^T]^T 
and \bar{n} = [n_0^T ... n_{N-1}^T]^T, respectively. The received signal in (1) is rewritten as the vector 

\bar{y} = (I_N \otimes H) \bar{x} + \bar{n} 

where \bar{n} is circularly symmetric complex Gaussian with zero mean and covariance matrix \Lambda_N \otimes 
R. Assuming prior knowledge of the temporal interference correlation matrix \Lambda_N, we need to 
estimate the channel matrix H and the spatial interference correlation matrix R. If R and \Lambda_N are 
nonsingular, the conditional probability density function (PDF) is 

Pr(\bar{y} | H, R) = \frac{1}{\pi^{N N_r} \det(\Lambda_N \otimes R)} \exp\{ -[\bar{y} - (I_N \otimes H)\bar{x}]^\dagger (\Lambda_N \otimes R)^{-1} [\bar{y} - (I_N \otimes H)\bar{x}] \}    (3) 



A. Maximum-likelihood solution 

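The ML estimate of the pair of matrices (H, R) is the value of (H, R) that maximizes the 
conditional PDF in (3), which is equivalent to maximizing \ln Pr(\bar{y} | H, R). 

Letting A and B denote m x m and n x n square matrices, and using the identities [18] 

\det(A \otimes B) = \det(A)^n \det(B)^m 

and 

(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}  if A and B are nonsingular, 

it can be shown that maximizing (3) is equivalent to minimizing 

f(H, R) = \ln\det(R) + \frac{1}{N} [\bar{y} - (I_N \otimes H)\bar{x}]^\dagger (\Lambda_N^{-1} \otimes R^{-1}) [\bar{y} - (I_N \otimes H)\bar{x}].    (4) 

Denoting the elements of \Lambda_N^{-1} as 

\Lambda_N^{-1} = \begin{bmatrix} \alpha_{0,0} & \cdots & \alpha_{0,N-1} \\ \vdots & \ddots & \vdots \\ \alpha_{N-1,0} & \cdots & \alpha_{N-1,N-1} \end{bmatrix}    (5) 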
we rewrite (4) as 

f(H, R) = \ln\det(R) + \frac{1}{N} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \alpha_{ij} (y_i - H x_i)^\dagger R^{-1} (y_j - H x_j) 

        = \ln\det(R) + trace\{ R^{-1} \frac{1}{N} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \alpha_{ij} (y_i - H x_i)(y_j - H x_j)^\dagger \}.    (6) 

To find the value of (H, R) that minimizes f(H, R) in (6), we set \partial f(H, R)/\partial H = 0. 
Define the weighted sample correlation matrices^1 as 

\tilde{R}_{yy} = \frac{1}{N} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \alpha_{ij} y_i y_j^\dagger,    (7) 

\tilde{R}_{xy} = \frac{1}{N} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \alpha_{ij} x_i y_j^\dagger    (8) 

and 

\tilde{R}_{xx} = \frac{1}{N} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \alpha_{ij} x_i x_j^\dagger.    (9) 

Using identities of matrix derivative [18], it can be shown [19] that (6) is minimized by 

\hat{H} = \tilde{R}_{xy}^\dagger \tilde{R}_{xx}^{-1}.    (10) 

Setting \partial f(H, R)/\partial R = 0, it can also be shown that the estimate of the spatial interference
correlation matrix is given by 

\hat{R} = \frac{1}{N} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \alpha_{ij} (y_i - \hat{H} x_i)(y_j - \hat{H} x_j)^\dagger    (11) 

        = \tilde{R}_{yy} - \hat{H} \tilde{R}_{xy}.    (12) 

We remark that if \tilde{R}_{xy} and \tilde{R}_{xx} in (10) were instead known cross- and auto-correlation matrices, 
the estimate for H would represent the Wiener solution. 

^1 To distinguish weighted sample correlation matrices from conventional sample correlation matrices in Section III-B, we 
denote the former by a tilde and the latter without a tilde. 
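
The closed-form estimates in (10) and (12) translate directly into matrix products when the training vectors are stacked as columns. The sketch below is an illustration under stated assumptions: the function name, the array layout (columns indexed by time), the seed and the example temporal correlation matrix are not from the paper. In the noise-free example the estimator recovers H exactly and returns a zero spatial correlation estimate, which serves as a sanity check.

```python
import numpy as np

def ml_estimates(Y, X, Lam):
    """Joint ML estimates of H and R from training data.

    Y: (N_r, N) received vectors, X: (N_t, N) training vectors,
    Lam: (N, N) known temporal correlation matrix (symmetric, nonsingular).
    """
    N = Y.shape[1]
    W = np.linalg.inv(Lam)                       # entries alpha_ij
    Ryy = Y @ W @ Y.conj().T / N                 # weighted sample correlations
    Rxy = X @ W @ Y.conj().T / N
    Rxx = X @ W @ X.conj().T / N
    H_hat = Rxy.conj().T @ np.linalg.inv(Rxx)    # eq. (10)
    R_hat = Ryy - H_hat @ Rxy                    # eq. (12)
    return H_hat, R_hat

# Hypothetical noise-free check: 2 transmit, 3 receive antennas, 8 training vectors.
rng = np.random.default_rng(3)
cg = lambda shape: (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
N_t, N_r, N = 2, 3, 8
idx = np.arange(N)
Lam = 0.5 ** np.abs(np.subtract.outer(idx, idx))
H_true = cg((N_r, N_t))
X = cg((N_t, N))
Y = H_true @ X
H_hat, R_hat = ml_estimates(Y, X, Lam)
```

Passing `np.eye(N)` for `Lam` gives the temporally white special case with conventional sample correlation matrices.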
B. Special case: temporally white interference 

If interference is temporally white, without loss of generality, we may substitute \Lambda_N = I_N into 
(7)-(12), and obtain estimates 

\hat{H}_w = R_{xy}^\dagger R_{xx}^{-1}    (13) 

and 

\hat{R}_w = R_{yy} - \hat{H}_w R_{xy}    (14) 

where the subscript w indicates temporally white interference, and the sample correlation matrices
are 

R_{yy} = \frac{1}{N} \sum_{i=0}^{N-1} y_i y_i^\dagger,    (15) 

R_{xy} = \frac{1}{N} \sum_{i=0}^{N-1} x_i y_i^\dagger    (16) 

and 

R_{xx} = \frac{1}{N} \sum_{i=0}^{N-1} x_i x_i^\dagger.    (17) 

Note that H w in (13) is the same as the channel estimate used in [15]. 



C. Whitening filter interpretation 

To obtain insight on the estimates in (10) and (12), we let the received signal vectors during the 
training period undergo a linear transformation where the transformed received signal vectors 
are 

[y'_0 ... y'_{N-1}] = [y_0 ... y_{N-1}] \Lambda_N^{-1/2}. 

At the output of the transformation, we have 

y'_i = H x'_i + n'_i,  i = 0, ..., N - 1, 

where the transformed training signal vectors and interference vectors are 

[x'_0 ... x'_{N-1}] = [x_0 ... x_{N-1}] \Lambda_N^{-1/2}  and  [n'_0 ... n'_{N-1}] = [n_0 ... n_{N-1}] \Lambda_N^{-1/2}, 

respectively. Concatenating the transformed interference vectors as \bar{n}' = [n'^T_0 ... n'^T_{N-1}]^T, it can 
be shown that 

\bar{n}' = (\Lambda_N^{-1/2} \otimes I_{N_r}) \bar{n} 

where \bar{n} = [n_0^T ... n_{N-1}^T]^T. Since the covariance matrix of \bar{n} is \Lambda_N \otimes R, the covariance matrix 
of \bar{n}' is 

cov(\bar{n}') = (\Lambda_N^{-1/2} \otimes I_{N_r}) cov(\bar{n}) (\Lambda_N^{-1/2} \otimes I_{N_r})^\dagger 

            = (\Lambda_N^{-1/2} \otimes I_{N_r}) (\Lambda_N \otimes R) (\Lambda_N^{-1/2} \otimes I_{N_r}) 

            = I_N \otimes R    (18) 

where we used (A \otimes B)^\dagger = A^\dagger \otimes B^\dagger and (A \otimes B)(C \otimes D) = AC \otimes BD [18]. We also 
used the fact that the temporal correlation matrix \Lambda_N is symmetric, as is \Lambda_N^{-1/2}. From (18), it 
is clear that the transformed interference vectors {n'_0 ... n'_{N-1}} are temporally white with 
spatial correlation matrix R. 

As a result, we can estimate H and R from the sample correlation matrices of transformed 
signal vectors as in Section III-B. The sample correlation matrix 

R_{y'y'} = \frac{1}{N} \sum_{i=0}^{N-1} y'_i y'^\dagger_i 

        = \frac{1}{N} [y'_0 ... y'_{N-1}] [y'_0 ... y'_{N-1}]^\dagger 

        = \frac{1}{N} [y_0 ... y_{N-1}] \Lambda_N^{-1/2} \Lambda_N^{-1/2} [y_0 ... y_{N-1}]^\dagger 

        = \frac{1}{N} [y_0 ... y_{N-1}] \Lambda_N^{-1} [y_0 ... y_{N-1}]^\dagger = \tilde{R}_{yy}, 

which shows that the weighted sample correlation matrix of {y_0 ... y_{N-1}} is equivalent to the 
sample correlation matrix of {y'_0 ... y'_{N-1}}. Similarly, the weighted sample correlation matrices 
\tilde{R}_{xy} and \tilde{R}_{xx} are equivalent to the sample correlation matrices R_{x'y'} and R_{x'x'}, respectively. 
Therefore, the estimates in (10) and (12) can also be realized by first temporally whitening 
the interference, and then forming the estimates from the sample correlation matrices of the 
transformed signal vectors. 
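
The equivalence between the weighted sample correlation matrix and the plain sample correlation matrix of the whitened vectors can be verified numerically. In the sketch below, the helper name `inv_sqrt`, the seed, the dimensions and the example temporal correlation matrix are assumptions for illustration; the symmetric inverse square root is computed from an eigendecomposition, which presumes a real symmetric positive definite temporal correlation matrix.

```python
import numpy as np

def inv_sqrt(Lam):
    """Symmetric inverse square root of a real symmetric positive definite matrix."""
    w, V = np.linalg.eigh(Lam)
    return (V / np.sqrt(w)) @ V.T        # V diag(w^{-1/2}) V^T

rng = np.random.default_rng(4)
N_r, N = 3, 6
idx = np.arange(N)
Lam = 0.7 ** np.abs(np.subtract.outer(idx, idx))    # hypothetical temporal correlation
Y = rng.standard_normal((N_r, N)) + 1j * rng.standard_normal((N_r, N))

Y_white = Y @ inv_sqrt(Lam)                          # temporally whitened vectors
plain = Y_white @ Y_white.conj().T / N               # sample correlation of y'
weighted = Y @ np.linalg.inv(Lam) @ Y.conj().T / N   # weighted sample correlation of y
```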
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IV. Data Detection 

We focus on ordered MMSE detection due to the better performance of MMSE compared to 
ZF detection [20]. For received signal vector y i = Hx, + n i9 modifying the BLAST algo- 
rithm in [16], the steps of ordered MMSE detection of cc 7 ; from y- with estimated channel and 
interference spatial correlation matrices are as follows: 
Step 1 Initialization: set k = 1, H k = H, x k = y k = 

Step 2 Calculate the estimation error covariance matrix P k = (I Ni +\^ k + H]jFt 'iffc)" 1 . Find 

m = argminPfc(j, j) where P k {jJ) denotes the 7th diagonal element of P k . Hence, 

3 

the rath signal component of x k has the smallest estimation error variance. 
Step 3 Calculate the weighting matrix A k = (lN t +i-k + H\R 1 H k )~ l H\R \ The mth ele- 
ment of Xk is estimated by x™ — Q (A fc (m, :) y k ) where A /;: (m. :) denotes the rath row 
of matrix A k and Q(-) denotes the slicing operation appropriate to the signal constella- 
tion. 

Step 4 Assuming the detected signal is correct, remove the detected signal from the received 
signal, y k+ i = y k — ra) where £*"&(:, ra) denotes the mth column of H k . 

Step 5 JTfc-fi is obtained by eliminating the rath column of matrix H k . x^+i is obtained by 
eliminating the mth component of vector x k . 

Step 6lfk<N u increment k and go to Step 2. 

We refer this scheme as one-vector-symbol detection as we detect x % using y i only. 
When interference is temporally colored, there may be performance to be gained by taking
the temporal interference correlation into account. That is, we may use $y_{N+1}, \ldots, y_M$ to detect
$x_{N+1}, \ldots, x_M$ jointly, where $N$ is the training length and $M$ is the frame length. Due to the
complexity of using all the received signal vectors and for simplicity of presentation, we consider
two-vector-symbol detection in which $(y_i, y_{i+1})$ is used to detect $(x_i, x_{i+1})$ jointly. The
one-vector-symbol algorithm can be easily extended to the two-vector-symbol version by writing

$$\underbrace{\begin{bmatrix} y_i \\ y_{i+1} \end{bmatrix}}_{\bar{y}_i} = \underbrace{\begin{bmatrix} H & 0 \\ 0 & H \end{bmatrix}}_{\bar{H}} \underbrace{\begin{bmatrix} x_i \\ x_{i+1} \end{bmatrix}}_{\bar{x}_i} + \underbrace{\begin{bmatrix} n_i \\ n_{i+1} \end{bmatrix}}_{\bar{n}_i}.$$

With the estimated channel, an estimate of $\bar{H}$, denoted as $\hat{\bar{H}}$, can be obtained. Using the
estimated spatial interference correlation and the known temporal interference correlation, we
are able to estimate the covariance matrix of $\bar{n}_i$, denoted as $\hat{\bar{R}}$. Replacing $x_i$, $y_i$, $H$ and $R$ in
the one-vector-symbol algorithm by $\bar{x}_i$, $\bar{y}_i$, $\hat{\bar{H}}$ and $\hat{\bar{R}}$, respectively, we obtain the
two-vector-symbol detection algorithm.
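
As a rough illustration of the stacking step, the sketch below builds the block-diagonal channel and a stacked covariance from per-symbol quantities. It assumes the separable (Kronecker) interference model used in this paper, so that block $(j, q)$ of the stacked covariance equals the temporal correlation entry times the spatial matrix. The function name `stack_two_vector_model` and its arguments are hypothetical.

```python
import numpy as np

def stack_two_vector_model(H, R, Lambda2):
    """Build the stacked quantities for two-vector-symbol detection.

    H       : channel estimate (N_r x N_t)
    R       : spatial interference correlation estimate (N_r x N_r)
    Lambda2 : known 2 x 2 temporal interference correlation matrix
    Assumes a separable covariance, i.e. block (j, q) equals Lambda2[j, q] * R.
    """
    H_bar = np.kron(np.eye(2), H)   # [[H, 0], [0, H]]
    R_bar = np.kron(Lambda2, R)     # temporal (x) spatial Kronecker product
    return H_bar, R_bar

# Usage: stack two consecutive received vectors the same way,
# y_bar = np.concatenate([y_i, y_next]), and run the one-vector-symbol
# procedure on (y_bar, H_bar, R_bar) to detect 2 * N_t symbols jointly.
```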

V. Applications 

In this section, we apply the channel estimation in Section III and data detection in Section IV 
to the case of a single-user link with one dominant co-channel interferer operating at different 
data rates. 

A. System model 

Consider a desired user with one dominant co-channel interferer. The assumption of one co- 
channel interferer can apply to cellular TDMA or FDMA systems when sectoring is used. For 
example, in 7-cell reuse systems, with 60 degree sectors, the number of co-channel interfering 
cells would be reduced to one [21]. We assume that desired and interfering users have N t and 
L transmitting antennas, respectively, and that there are N r receiving antennas. Assuming ther- 
mal noise is small relative to interference, we ignore thermal noise in the problem formulation. 
An investigation of this assumption in channels with noise appears in Section V-D.3. We also 
assume that over the duration of a transmitted frame, a randomly delayed replica of the inter- 
fering signal is transmitted continuously and that the interference statistics do not change. This 
assumption may not hold for asynchronous packet transmission systems. In a slow flat fading
environment, the vector signal at receiving antennas is

$$y(t) = \sqrt{\frac{P_s T}{N_t}}\, H \sum_{k=0}^{M-1} x_k\, g(t - kT) + \sqrt{\frac{P_I T_I}{L}}\, H_I \sum_{k=-\infty}^{\infty} b_k\, g_I(t - kT_I - \tau) \qquad (19)$$

where $M$ is the frame length, and $H$ ($N_r \times N_t$) and $H_I$ ($N_r \times L$) are the channel matrices of
the desired and interfering users, respectively. The channel matrices are also assumed fixed over
a frame, and have independent realizations from frame to frame*. The data transmission rates of
the desired and interfering users are $1/T$ and $1/T_I$, respectively. The spectra of transmit impulse
responses $g(t)$ and $g_I(t)$ are square-root raised cosines with parameters $T$ and $T_I$, respectively.
The same rolloff factor, $\beta$, is assumed for both $g(t)$ and $g_I(t)$. The data vectors of the desired
and interfering users are $x_k$ ($N_t \times 1$) and $b_k$ ($L \times 1$), respectively. We assume that data symbols
in $x_k$'s and $b_k$'s are mutually independent, zero-mean and with unit variance. We denote $P_s$ and
$P_I$ as the transmit powers of the desired and interfering users, respectively. The delay of the
interfering user relative to the desired user is $\tau$, assumed to lie in $0 < \tau < T_I$.

Passing $y(t)$ in (19) through a filter matched to the transmit impulse response of the desired
user, $g(t)$, the vector signal at the output of the matched filter is

$$y_{MF}(t) = \sqrt{\frac{P_s T}{N_t}}\, H \sum_{k=0}^{M-1} x_k\, \tilde{g}(t - kT) + \sqrt{\frac{P_I T_I}{L}}\, H_I \sum_{k=-\infty}^{\infty} b_k\, \tilde{g}_I(t - kT_I - \tau) \qquad (20)$$

where $\tilde{g}(t) = g(t) * g(t)$, $\tilde{g}_I(t) = g_I(t) * g(t)$, and $*$ denotes convolution. As a result, $\tilde{g}(t)$ has a
raised cosine spectrum and satisfies the Nyquist condition for zero intersymbol interference.

Assuming perfect synchronization for the desired user, as we sample the output of the matched
filter (20) at time $t = jT$, we obtain

$$y_j = \sqrt{\frac{P_s T}{N_t}}\, H x_j + \underbrace{\sqrt{\frac{P_I T_I}{L}}\, H_I \sum_{k=-\infty}^{\infty} b_k\, \tilde{g}_I(jT - kT_I - \tau)}_{n_j}. \qquad (21)$$

The interference vector $n_j$ is zero-mean since the data vector of the interferer, $b_k$, is zero-mean. Note
that there is no intersymbol interference for the desired user. However, due to the interferer's
delay and/or mismatch between the transmit and receive impulse responses, intersymbol
interference exists for the interferer.

B. Interference statistics

The cross-correlation between the interference vectors in (21) at time $jT$ and $qT$ is

$$E\{n_j n_q^{\dagger}\} = \frac{P_I T_I}{L}\, H_I\, E\left\{ \left( \sum_{k_1=-\infty}^{\infty} b_{k_1}\, \tilde{g}_I(jT - k_1 T_I - \tau) \right) \left( \sum_{k_2=-\infty}^{\infty} b_{k_2}^{\dagger}\, \tilde{g}_I(qT - k_2 T_I - \tau) \right) \right\} H_I^{\dagger}$$

$$= \frac{P_I T_I}{L}\, H_I H_I^{\dagger} \sum_{k=-\infty}^{\infty} \tilde{g}_I(jT - kT_I - \tau)\, \tilde{g}_I(qT - kT_I - \tau),$$

where the last equality is due to the facts that $E\{b_{k_1} b_{k_2}^{\dagger}\} = 0$ for $k_1 \neq k_2$ and $E\{b_k b_k^{\dagger}\} = I_L$.
During a training period of $N$ vector symbols, the covariance matrix of the concatenated
interference vector $\bar{n} = [n_0^T \cdots n_{N-1}^T]^T$ has the form of (2) where

$$\Lambda_N(j, q) = \sum_{k=-\infty}^{\infty} \tilde{g}_I(jT - kT_I - \tau)\, \tilde{g}_I(qT - kT_I - \tau), \quad 0 \le j, q \le N - 1 \qquad (22)$$

and

$$R = \frac{P_I T_I}{L}\, H_I H_I^{\dagger}. \qquad (23)$$

The $N_r \times N_r$ spatial correlation matrix $R$ is determined by the interferer's channel matrix. The
$N \times N$ temporal correlation matrix $\Lambda_N$ depends on parameters $T$ and $T_I$, delay $\tau$ and pulse
$\tilde{g}_I(t)$; it can be calculated a priori if these parameters are known. The temporal correlation is
due to intersymbol interference in the sampled interfering signal. We remark that for the case of
multiple interferers with the same delay, the covariance matrix of interference also has the form
of (2).

We study temporal interference correlation in the cases where (1) the interferer has the same
data rate as that of the desired signal ($T = T_I$), and (2) the data rate of the desired user is an
integer multiple of that of the interferer ($T_I = mT$, $m > 1$).

1) Interferer at the same data rate as desired signal: With $T = T_I$, $\tilde{g}_I(t)$ has a raised cosine
spectrum and is given by [22]

$$\tilde{g}_I(t) = \operatorname{sinc}(\pi t/T)\, \frac{\cos(\pi \beta t/T)}{1 - 4\beta^2 t^2/T^2}.$$

We note that $\Lambda_N(j, q)$ depends on $j - q$. This indicates that the sequence consisting of
interference vectors is stationary. Hence, the temporal correlation matrix is symmetric Toeplitz. By
appropriate truncation of the infinite series in (22), we can numerically calculate the temporal
correlation matrix. For the case of $\beta = 1$, $T = 1$ and $\tau = 0.5$, the elements of the temporal
correlation matrix are

$$\Lambda_N(j, q) = \begin{cases} 0.5 & j = q \\ 0.25 & |j - q| = 1 \\ 0 & \text{otherwise} \end{cases} \quad \text{for } 0 \le j, q \le N - 1. \qquad (24)$$
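
The numerical evaluation of (22) for the same-data-rate case can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the truncation length `K`, and the handling of the removable singularity of the raised cosine pulse are assumptions made for the example, and the routine only covers $T = T_I$, where the sampled interferer pulse is itself a raised cosine.

```python
import numpy as np

def raised_cosine(t, T=1.0, beta=1.0):
    """Raised cosine pulse with g(0) = 1; assumes 0 < beta <= 1."""
    x = np.asarray(t, dtype=float) / T
    denom = 1.0 - (2.0 * beta * x) ** 2
    singular = np.isclose(denom, 0.0)           # points |t| = T / (2 beta)
    safe = np.where(singular, 1.0, denom)
    # np.sinc(x) is sin(pi x) / (pi x)
    g = np.sinc(x) * np.cos(np.pi * beta * x) / safe
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * beta))   # limiting value there
    return np.where(singular, limit, g)

def temporal_correlation(N, T=1.0, tau=0.5, beta=1.0, K=200):
    """Truncated series (22) for an interferer at the same data rate (T_I = T)."""
    j = np.arange(N)[:, None]
    k = np.arange(-K, K + 1)[None, :]
    G = raised_cosine(j * T - k * T - tau, T, beta)   # N x (2K + 1) samples
    return G @ G.T                                     # sum over k of products

Lambda_4 = temporal_correlation(4)   # beta = 1, T = 1, tau = 0.5 as in (24)
```

With these parameters the only nonzero pulse samples fall at $t = \pm 0.5$, where the pulse equals 0.5, which is why the result is tridiagonal with 0.5 on the diagonal and 0.25 next to it.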

2) Interferer at a lower data rate than desired signal: It can be shown that $\tilde{g}_I(t)$ is given by

$$\tilde{g}_I(t) = \mathcal{F}^{-1}\left\{ \sqrt{G_{rc,T_I}(f)}\, \sqrt{G_{rc,T}(f)} \right\}$$

where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform, and $G_{rc,T}(f)$ is the raised cosine Fourier
spectrum with parameter $T$ and rolloff factor $\beta$. Unlike in the case of the same-data-rate
interferer, where $\Lambda_N(j, q)$ depends on $j - q$, in the case of the lower-data-rate interferer $\Lambda_N(j, q)$
depends on the values of $j$ and $q$. This indicates that the sequence consisting of interference
vectors is cyclostationary [22][23]. It can be shown that $\Lambda_N(j, q)$ is periodic with period $m$, i.e.,
$\Lambda_N(j, q) = \Lambda_N(j + m, q + m)$. As a result, the temporal correlation matrix $\Lambda_N$ is symmetric,
but not Toeplitz. Furthermore, for $N > m$, the number of nontrivial eigenvalues of $\Lambda_N$ is
$\lceil N/m \rceil$, where $\lceil \cdot \rceil$ rounds the argument to the nearest integer towards infinity [24]. For the case
of $T_I = 2T$, $T = 1$, $\beta = 1$, $\tau = 0.25$ and training length $N = 6$, by numerical calculation of
(22) with appropriate series truncation, the temporal correlation matrix is

$\Lambda_6$ = [6 x 6 matrix; its entries are not recoverable from this copy apart from the values 0.400 and 0.277] (25)

Note that $\Lambda_6$ in (25) is singular as the number of nontrivial eigenvalues is 3.



C. Data detection without estimating channel and interference 

During a training period of $N$ symbol vectors, instead of estimating the channel matrix and
interference statistics, one can alternatively employ a least-squares (LS) estimate of matrix $M$
which minimizes the average estimation error

$$f^2(M) = \frac{1}{N} \sum_{i} \| x_i - M y_i \|^2,$$

with the sum taken over the $N$ training vectors. By setting $\partial f^2(M)/\partial M = 0$, we obtain

$$M = R_{xy} R_{yy}^{-1} \qquad (26)$$

where sample correlation matrices $R_{xy}$ and $R_{yy}$ are defined in (16) and (15), respectively. The
transmitted signal vector $x_i$ is detected as $Q(M y_i)$, where $Q(\cdot)$ is the slicing operation
appropriate to the signal constellation. We remark that (26) is the well-known Direct Matrix Inversion
(DMI) algorithm [25] generalized for multiple input signals. A significant loss in performance
is expected for this LS detector since, without estimates of the channel and spatial interference
correlation matrices, iterative MMSE detection cannot be performed.
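
The LS (DMI-style) detector of this subsection can be sketched as below. The sketch assumes the usual sample correlation definitions, namely averages of $x_i y_i^{\dagger}$ and $y_i y_i^{\dagger}$ over the training block; equations (15) and (16) are not reproduced here, so that normalization is an assumption, and `dmi_matrix` is a hypothetical name.

```python
import numpy as np

def dmi_matrix(X_train, Y_train):
    """LS estimate M = R_xy R_yy^{-1} from training data.

    X_train : known training symbols, shape (N_t, N)
    Y_train : corresponding received vectors, shape (N_r, N)
    Assumes sample correlations are plain averages over the N columns.
    """
    N = X_train.shape[1]
    R_xy = X_train @ Y_train.conj().T / N
    R_yy = Y_train @ Y_train.conj().T / N
    # Solve R_yy^H M^H = R_xy^H instead of forming the inverse explicitly.
    return np.linalg.solve(R_yy.conj().T, R_xy.conj().T).conj().T

# Usage: after training, a data vector y is detected by slicing M @ y
# to the nearest constellation point.
```

In a noise-free, interference-free setting with orthogonal training rows (for example rows of a DFT matrix), this estimate reduces to the channel inverse, which gives a quick sanity check of an implementation.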

D. Simulation results 

Monte-Carlo simulations are used to assess the benefits of taking temporal and spatial inter- 
ference correlation into account for channel estimation and data detection in the case of one 
interferer. Although temporal interference correlation may be difficult to estimate in practice,
we examine this as a benchmark and determine the performance loss due to ignoring this
correlation. We evaluate average symbol error rates (SERs) in independent Rayleigh fading channels
of rich scattering, i.e., the elements in channel matrices $H$ and $H_I$ are independent, identically
distributed (i.i.d.) zero-mean complex Gaussian with unit variance. We assume that the
desired user has 5 transmitting and 5 receiving antennas, and the interfering user has 6
transmitting antennas². Both the desired and interfering users employ uncoded QPSK modulation.
The training signal vectors are columns of an FFT matrix [16] to guarantee orthogonal training 
sequences from different transmitting antennas. We define signal-to-interference-power-ratio
SIR (dB) = $10 \log_{10}(P_s/P_I)$. Without loss of generality, we set $P_I = 1$ in the simulation. The SERs
of two cases are simulated: (1) interferer at the same data rate as the desired signal, and (2) the 
data rate of the desired user is twice that of the interferer. 

In Figs. 1 to 4, with solid and dashed lines representing one- and two-vector-symbol data 
detection, respectively, we plot average SERs for the following cases: 

(a) perfectly known channel parameters and interference statistics, with one-vector-symbol 
(curve 1) and two-vector-symbol (curve 2) detection; 

(b) channel and spatial interference correlation matrices are estimated assuming known
temporal interference correlation, with one-vector-symbol (curve 3) and two-vector-symbol
(curve 4) detection;

(c) channel and spatial interference correlation matrices are estimated assuming tempo- 
rally white interference, with one-vector-symbol detection (curve 5); 

(d) only the channel matrix H is estimated assuming temporally white interference; an 
identity spatial interference correlation matrix is used in one-vector-symbol data de- 
tection (curve 6). 

(e) LS estimate of the transmitted signal vector without ordered detection (Section V-C) 
(curve 7). 

We remark that cases (a) and (b) are benchmarks presented for reference while case (d) corre- 
sponds to the well-known BLAST system in [4] [16]. 

1) Interferer at the same data rate as desired signal: We examine the case of $T = 1$, $\beta = 1$,
$\tau = 1/2$, and the nonsingular temporal interference correlation matrix shown in (24). Figs. 1
and 2 show the average SERs for training lengths $2N_t$ and $4N_t$, respectively. Comparing LS
detection (curve 7) with the other methods, we see that much lower symbol error rates are
achieved by ordered MMSE detection, as expected.

² For a nonsingular spatial interference correlation matrix, we set $N_r < L$.

Comparing curves 5 and 6, we observe that for a training length of $4N_t$ symbols, gains can
be obtained by estimating spatial interference correlation. However, shorter training lengths 
such as 2N t produce inaccurate estimates of spatial interference correlation which in turn do not 
yield any benefit over assuming spatially white interference. As expected, we observe that the 
improvement by taking into account estimated spatial correlation increases with longer training 
lengths. 

Examining curves 3 and 5 in Fig. 2, we observe that the improvement in taking temporal 
interference correlation into account in channel estimation is not significant. Moreover, this 
rate of improvement rapidly diminishes as the training length increases. This can be explained 
by noting that in estimating channel and spatial interference correlation matrices for temporally 
colored interference, the received signal vectors first undergo a transformation which temporally 
whitens the interference vectors as discussed in Section III-C. Since the temporal correlation in 
(24) drops quickly to zero after one time lag, the benefit in temporal whitening of interference 
vectors is not significant, especially for long training lengths. 

By comparing curves 3 and 4 in Fig. 2, there is a slight improvement in using two-vector- 
symbol over one-vector-symbol detection. This implies that not much gain can be achieved 
by taking temporal interference correlation into account in data detection owing to the low 
temporal correlation. Due to better estimates of channel and interference spatial correlation 
matrices obtained with a longer training length, the performance gap between curves 3 and 4 
should increase as the training length increases. 

By comparing curves 4 and 6 in Fig. 2, we observe a 1.5 dB gain in SIR obtained by estimating
spatial interference correlation and taking explicit advantage of known temporal interference
correlation in channel estimation and data detection using a training length of $4N_t$. About 1 dB of
that gain is due to the estimation of spatial interference correlation, and the remaining 0.5 dB gain
is due to exploiting temporal interference correlation in channel estimation and data detection.

2) Interferer at a lower data rate than desired signal: We examine the case of $T_I = 2T$,
$T = 1$, $\beta = 1$, $\tau = 0.25$ and the temporal interference correlation matrix for training length
$N = 6$ shown in (25). Recall that the temporal correlation matrix for the lower-data-rate-interferer
case is singular. To avoid the singularity, the diagonal elements of $\Lambda_N$ are increased
by a small amount; hence, the temporal correlation matrix used for channel estimation may be
modified to $\Lambda_N + \delta I_N$ within the proposed framework. In our simulation, we chose $\delta = 0.01$.
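
The effect of this regularization is easy to check numerically. The sketch below uses a made-up rank-3 matrix as a stand-in for a singular 6 x 6 temporal correlation matrix (it is not the matrix in (25), whose entries are not reproduced here); the function name `load_diagonal` is hypothetical.

```python
import numpy as np

def load_diagonal(Lambda_N, delta=0.01):
    """Return Lambda_N + delta * I, which is invertible for any delta > 0
    when Lambda_N is positive semidefinite."""
    return Lambda_N + delta * np.eye(Lambda_N.shape[0])

# Hypothetical singular example: 6 x 6, positive semidefinite, rank 3.
B = np.kron(np.eye(3), np.ones((2, 1)))    # 6 x 3 factor
Lambda_singular = B @ B.T
Lambda_loaded = load_diagonal(Lambda_singular, delta=0.01)

rank_before = np.linalg.matrix_rank(Lambda_singular)   # rank deficient
rank_after = np.linalg.matrix_rank(Lambda_loaded)      # full rank
```

Loading shifts every eigenvalue up by $\delta$, so the zero eigenvalues become $\delta$ and the inverse needed for temporal whitening exists.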

The same set of average SER curves as in the same-data-rate-interferer case is simulated.
Figs. 3 to 4 show the SERs for different training lengths. As in the case of the same-data-rate 
interferer, curve 7 illustrates the poor performance without ordered detection. Curves 5 and 6 
suggest that for short training lengths it is better to estimate only the channel matrix and assume 
spatially white interference in data detection; however, for moderately long training lengths, 
gains can be obtained by estimating spatial interference correlation. 

By examining curves 3 and 5 in Fig. 4, we observe that the improvement in taking temporal 
interference correlation into account in channel estimation, although larger than that in the same- 
data-rate-interferer case due to the high temporal correlation in the lower-data-rate-interferer 
case, is still not that significant. 

In contrast to the same-data-rate-interferer case, curves 3 and 4 in Fig. 4 show that the 
improvement of two-vector-symbol over one-vector-symbol detection is significant due to the 
higher temporal interference correlation. This implies that significant gain can be achieved by 
taking known temporal interference correlation into account in data detection for the lower-data- 
rate-interferer case. 

By comparing curves 4 and 6 in Fig. 4, for training length $4N_t$ there is a total of 4 dB gain
in SIR by estimating spatial interference correlation and taking advantage of known temporal
interference correlation in channel estimation and data detection. About 3.5 dB of the gain is due
to exploiting temporal interference correlation in channel estimation and data detection.

3) Effect of model mismatch: With one interferer and a finite SNR, the interference-plus-noise
statistics can only be approximately modelled using a Kronecker product. Here, we
investigate when this approximation breaks down. We model thermal noise as a zero-mean circularly
symmetric complex Gaussian vector with covariance matrix $\sigma^2 I_{N_r}$, i.e., independent from
antenna to antenna, with noise power $\sigma^2$ on each antenna. We define interference-to-noise-power-ratio
INR (dB) = $10 \log_{10}(P_I/\sigma^2)$, where $P_I = 1$ is used in the simulations. For the case of an interferer
at the same data rate and using a training length $4N_t$, we observe in curves 3 and 5 in Fig. 5 that
at INRs below 17dB, taking interference temporal correlation into account appears not to be of 
benefit. Fig. 6 shows the corresponding comparison for the case of the lower-data-rate interferer. 
In this case, temporal correlation is larger and the decoupled model of interference-plus-noise 
statistics breaks down at INRs lower than 12dB. 

4) Effect of exploiting spatial interference-plus-noise correlation: From the above results, 
temporal interference correlation, even if known, may not result in a performance benefit at
lower INRs due to model mismatch. Therefore, we assess the benefit of taking only the spatial 
correlation of interference-plus-noise into account. As a reference, we compare performance to 
the case of assuming interference-plus-noise to be spatially white. With total interference power 
fixed, Fig. 7 compares the average SER for one (solid line) and two (broken line) interferers. In 
the case of two interferers, the interferers have equal power and random relative delays. Both 
desired and interfering users employ a (5,5) MIMO link, a total-interference-to-noise-ratio of 
12 dB, and the training length is $4N_t$. Both the desired and interfering users operate at the same
data rate. Fig. 7 shows that for one interferer, there is a 1.2 dB gain over a wide range of SINRs by
estimating the spatial correlation of interference-plus-noise. For the case of two equal-powered 
interferers, the corresponding gain in SINR is negligible. 

VI. Conclusions 

By modelling interference statistics as approximately temporally and spatially separable, we 
have investigated maximum-likelihood joint estimation of channel parameters and spatial inter- 
ference correlation matrices. We have assessed the impact of temporal and spatial interference 
correlation on channel estimation and data detection. For training lengths of at least four times
the number of transmitting antennas, gains of around 1 dB are observed by estimating spatial
interference correlation. We determine that an additional 0.5 to 3.0 dB in performance gain would
result if known temporal correlation were exploited. For shorter training lengths, however, it is
better to estimate only the channel matrix and assume spatially white interference in data
detection. One source of temporal correlation occurs where a co-channel interferer operates at a data
rate lower than that of the desired user. Exploiting temporal interference correlation in
channel estimation was found not to be of benefit. However, if temporal correlation is significant,
as in the case of lower-data-rate interference, significant performance gains by exploiting temporal
interference correlation in data detection are theoretically possible. The minimum
interference-to-noise ratio (INR) levels below which the separable temporal and spatial interference
correlation model was shown to break down and provide no benefit ranged from 12 to 17 dB, depending
on the level of temporal correlation. Of more practical significance, it was shown that at a total INR of 12 dB,
1.2 dB of performance gain can be obtained over a wide range of SINRs by estimating spatial
correlation only and neglecting temporal correlation.

Fig. 1. Average symbol error rate vs. SIR with $N_t = N_r = 5$, $L = 6$, and training length $2N_t$ under independent Rayleigh
fading. Both the desired and interfering users are at the same data rate.

Fig. 2. Average symbol error rate vs. SIR with $N_t = N_r = 5$, $L = 6$, and training length $4N_t$ under independent Rayleigh
fading. Both the desired and interfering users are at the same data rate.

Fig. 3. Average symbol error rate vs. SIR with $N_t = N_r = 5$, $L = 6$, and training length $2N_t$ under independent Rayleigh
fading. The data rate of the desired user is twice that of the interfering user.

Fig. 4. Average symbol error rate vs. SIR with $N_t = N_r = 5$, $L = 6$, and training length $4N_t$ under independent Rayleigh
fading. The data rate of the desired user is twice that of the interfering user.

[Fig. 5 plot. Horizontal axis: INR (dB). Legend: est. H&R, spatially and temporally colored interference, one-vector-symbol (curve 3); est. H&R, spatially and temporally colored interference, two-vector-symbol (curve 4); est. H&R, spatially colored and temporally white interference (curve 5).]

Fig. 5. Average symbol error rate vs. INR with $N_t = N_r = 5$, $L = 6$, SIR = 10 dB, and training length $4N_t$ under independent
Rayleigh fading. Both the desired and interfering users are at the same data rate.

Fig. 6. Average symbol error rate vs. INR with $N_t = N_r = 5$, $L = 6$, SIR = 10 dB, and training length $4N_t$ under independent
Rayleigh fading. The data rate of the desired user is twice that of the interfering user.

Fig. 7. The improvement of estimating spatial correlation of interference-plus-noise in practical systems. With total interference
power fixed, the solid lines are for one interferer, and the broken lines are for two interferers. Both the desired and interfering
users employ a (5, 5) MIMO link, the same data rate, total-interference-to-noise-ratio of 12 dB, and training length of $4N_t$.

Abstract- Space-time adaptive processing (STAP) is a key
technology for improved detection of airbreathing and ground
moving targets from airborne and spaceborne radar platforms.
Subspace methods are central to STAP algorithm development and
provide important insight into adaptive radar signal processing. In
this paper, we consider unique aspects of subspace methods to
address sample support requirements and training approaches for
covariance matrix estimation. Furthermore, we discuss subspace
characterizations for inhomogeneous signal environments and 
validate the proposed models with measured airborne radar data. 
It is shown herein that inhomogeneity degrades STAP performance 
and that subspace methods intrinsically serve as a cornerstone for 
robust STAP algorithm development. 

I. Introduction 

Space-time adaptive processing (STAP) is a multi- 
dimensional, adaptive filtering approach with application to 
airborne [1] and spaceborne radar, as well as communication 
and sonar systems. In the case of airborne radar, STAP exploits 
diversity among measurements in azimuth, elevation, slow-time 
(Doppler) and fast-time (range) to suppress correlated clutter 
and jamming signals while enhancing target signal components. 
Thus, STAP enables the detection of weak targets. The 
terminology adaptive implies that critical statistical quantities 
are unknown, but are estimated from available secondary data. 
When the secondary data appear inhomogeneous, erroneous 
covariance matrix estimates lead to STAP performance 
degradation with respect to the optimal condition. The 
deleterious effect of inhomogeneity on STAP has previously 
been reported based on the analysis of measured radar data from 
the US Air Force sponsored Multichannel Airborne Radar 
Measurements (MCARM) effort and DARPA's Mountaintop 
program [2-3]. 

While clutter inhomogeneity is a principal concern for 
adaptive airborne surveillance radar, other issues abound in the 
research literature, including the development of novel, robust 
algorithms. Recently reported subspace-based STAP methods 
hold particular interest since such approaches may provide 
results closer to optimal than competing techniques [4-6]. 
Subspace concepts also suggest STAP heuristics with lower 
computational burden [7]. Yet, beyond algorithm development,
subspace methods possess unique attributes essential for
advancing adaptive airborne radar. In this paper, we use
subspace principles to discuss localized training, characterize
inhomogeneous signal environments and develop techniques to
enhance STAP capability.

II. Multichannel Airborne Radar Signal Processing 

A. STAP Basics 

A seminal paper by Brennan and Reed discusses the basic
theory of STAP for airborne radar [1]. We provide a brief
overview. Consider a pulse Doppler radar with $M$ spatial
channels radiating $N$ pulses. Next, define the vector of all
complex baseband observations within the coherent processing
interval (CPI), corresponding to range cell $k$, as

$$x_k = \left[ x_{s,k}^T(1) \;\; x_{s,k}^T(2) \;\; \cdots \;\; x_{s,k}^T(N) \right]^T. \qquad (1)$$

$x_{s,k}(n) \in C^{M \times 1}$ is the spatial snapshot comprised of all
spatial samples for the $n$th pulse, whereas we refer to
$x_k \in C^{MN \times 1}$ as the space-time data vector. In STAP, we
compute the adaptive filter output as

$$y_k = \hat{w}_k^H x_k; \quad \hat{w}_k = a\, \hat{R}_k^{-1} s_{s\text{-}t}(\theta, f_d), \qquad (2)$$

where $\hat{w}_k \in C^{MN \times 1}$ is the adaptive weight vector,
$\hat{R}_k \in C^{MN \times MN}$ is the sample covariance matrix approximating
the true, unknown interference covariance matrix,
$R_k = E[x_{k/H_0} x_{k/H_0}^H]$, $s_{s\text{-}t}(\theta, f_d)$ approximates the target
steering vector for angle $\theta$ and Doppler $f_d$, and $a$ is a constant.
The choice for $\hat{w}_k$ in (2) attempts to maximize the
signal-to-interference-plus-noise ratio (SINR). Generally, one
employs the maximum likelihood estimate (MLE)

$$\hat{R}_k = \frac{1}{L} \sum_{m=1}^{L} x_m x_m^H \qquad (3)$$

to compute the sample covariance matrix. The Reed-Mallett-Brennan
(RMB) rule governs the use of (3).

Reed-Mallett-Brennan (RMB) Rule [8]: Assume the secondary
data, $\{x_m\}_{m=1}^{L}$, are independent and identically distributed (iid).
Then, equation (3) minimally requires $L \approx 2 \cdot \dim(x_k)$ if the
expected SINR of the adaptive processor output is to be within
3 dB of optimal.

A limited number of potentially inhomogeneous secondary data
complicates the use of (3).
B. Basic Theory of Subspace-Based STAP Techniques 

Subspace methods are appealing because they allow us to
consider multidimensional signal vectors in terms of a limited
size, well-defined basis. The eigenvectors of the covariance
matrix serve as this basis. Observe that $R_k$ is Hermitian and
positive definite. For this reason, a unitary matrix $Q_k$ exists
such that

$$R_k = Q_k \Lambda_k Q_k^H = \sum_{m=1}^{MN} \lambda_k(m)\, q_k(m)\, q_k^H(m);$$
$$\Lambda_k = \operatorname{diag}(\lambda_k(m)); \qquad (4)$$
$$Q_k = [\, q_k(1) \;\; q_k(2) \;\; \cdots \;\; q_k(MN) \,],$$

where $\lambda_k(m)$ is an eigenvalue of $R_k$ with related eigenvector
$q_k(m)$. Let $\lambda_k(1) \ge \lambda_k(2) \ge \cdots \ge \lambda_k(MN)$, and note that
$\{q_k(m)\}_{m=1}^{MN}$ defines an orthonormal set. The larger values of
$\lambda_k(m)$ define the interference subspace, whereas the smaller
values define the noise subspace. The interference-plus-noise
covariance matrix is generally of low numerical rank [4-6, 7].
We may express the optimal weight vector as [4]

$$w_k = \frac{a}{\lambda_0} \left[ s_{s\text{-}t} - \sum_{m=1}^{MN} \frac{\lambda_k(m) - \lambda_0}{\lambda_k(m)}\, \mu_k^{*}(m)\, q_k(m) \right], \qquad (5)$$

where $\lambda_0$ represents the noise floor eigenvalue,
$\mu_k(m) = s_{s\text{-}t}^H q_k(m)$ is the projection of the $m$th eigenvector onto
the quiescent pattern defined by $s_{s\text{-}t}$, and $Q_k Q_k^H = I_{MN}$. From
(5) we observe that the weight vector equals the quiescent
response minus the weighted sum of eigenvectors. Ideally,
noise eigenvalues result in no subtraction, whereas interference
eigenvectors lead to a notch in the quiescent pattern. Often, the
noise subspace appears perturbed due to the limited sample
support. This perturbation results in raised sidelobe levels due
to the addition of random terms in (5). Constructing the weight
vector from principal eigenvectors, or applying diagonal loading
to the sample covariance matrix, i.e., adding an identity matrix
times a constant to $\hat{R}_k$, improves the sidelobe response at the
expense of reduced null depth.

The recent literature appears to categorize subspace-based
STAP algorithms as follows. Principal component approaches
choose the eigenbasis corresponding to the largest eigenvalues
to construct the weight vector [4]. In contrast, metric-based
approaches select the eigenbasis to optimize a given cost
function [6]. This eigenbasis may span both noise and
interference subspaces. Principal component inverse methods
construct a weight vector spanning the noise subspace [5].
Since the noise and interference subspaces appear orthogonal,
constructing the weight vector as a linear combination of noise
eigenvectors annihilates correlated clutter lying in the orthogonal
interference subspace.

III. Localized Training

As already mentioned, airborne radar signal environments can
appear quite inhomogeneous. Conditions leading to instances
of clutter inhomogeneity include shadowing, spatially-varying
clutter reflectivity and interfaces between dominant clutter types
(e.g., land-sea interface), clutter discretes, moving scatterers,
and weather phenomena. Additionally, chaff and coherent
repeater jamming are cases of induced inhomogeneity. In the
case of spatially-varying clutter and interfaces or edges, it is
desirable to locally train the adaptive processor to best capture
interference statistics nearest the primary test data. Even if all
data were iid, the product of $M$ and $N$ can appear sufficiently
large such that use of (3) does not yield acceptable performance.
We may exploit the low numerical rank of the clutter covariance
matrix to reduce the required sample support.

Dominant Subspace Estimation: The sample support required to
estimate the dominant components of the interference
covariance matrix is on the order of $2 \cdot \operatorname{rank}(R_{k,I})$, where $R_{k,I}$ is
the interference covariance matrix. Exact dominant subspace
information enables optimal performance as seen via (5).

Two issues to consider are as follows. Computing the
subspace information can be numerically burdensome. A
heuristic proposed in [7] ameliorates this situation. Similarly,
beamspace and Doppler transformations often serve as suitable
rank-reducing approaches. Diagonal loading can be used to
improve poor adaptive receive sidelobe response, but this is a
trade-off since it impacts attainable null depth and the
appropriate level of loading is heuristic [10]. In Figure 1, we
compare the eigenspectra of the interference-plus-noise
covariance matrix for the case of a linear array and varying iid
sample support. Observe that principal eigenvalues are well
represented, independent of sample support, whereas estimating
the noise eigenvalues is a much more difficult task. Figure 2
shows the adapted patterns for several cases of varying sample
support. Notice that the poor noise eigenvalue estimates lead to
perturbed sidelobe levels, as indicated by (5). Choosing only
principal components to construct the weight vector removes
raised sidelobe levels. Diagonal loading offers a similar effect
by compressing the noise eigenvalues and improving the
covariance matrix condition number.
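
The eigen-expansion of the weight vector and the principal-component truncation can be illustrated with a short NumPy sketch. Everything here is an assumption made for illustration: the function name `eigen_weights`, the default `a = 1`, and the convention that each eigenvector is weighted by the inner product $q^H s$ of that eigenvector with the steering vector.

```python
import numpy as np

def eigen_weights(R, s, lam0, rank=None, a=1.0):
    """Weight vector as quiescent response minus weighted eigenvectors.

    R    : Hermitian covariance matrix (MN x MN)
    s    : steering vector (MN,)
    lam0 : noise floor eigenvalue
    rank : if given, keep only that many principal eigenvectors
    """
    lam, Q = np.linalg.eigh(R)             # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]          # reorder so the largest come first
    lam, Q = lam[order], Q[:, order]
    if rank is not None:                   # principal-component truncation
        lam, Q = lam[:rank], Q[:, :rank]
    proj = Q.conj().T @ s                  # q_m^H s for each retained eigenvector
    return (a / lam0) * (s - Q @ ((lam - lam0) / lam * proj))
```

With all eigenvectors retained, the result equals `a * np.linalg.solve(R, s)` for any positive `lam0`. With only the principal eigenvectors retained, it is exact when the discarded eigenvalues all equal `lam0`, and otherwise it suppresses the random terms contributed by poorly estimated noise eigenvalues.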



In summary, subspace processing requires lower sample 
support, which in turn allows localized processing. This is a 
necessary step towards the best possible performance when 
operating in inhomogeneous clutter scenarios. 
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Fig. 1 . Eigenspectra for varying sample support measured in terms of 
degrees of freedom (DoF). Three noise jammers are present. 
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R k - £ o)(t) s t (t) s k "(t) 
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under simplifying assumptions, in this case, o k (d) and s k (d) 
are unknown variables representing the power and space-time 
steering vector for the ofP principal interference sources. It 
is the variation of o^(C) and s k ($) from realization to 
realization which defines a condition of inhomogeneity. The 
effect is best understood by examining the impact on covariance 
structure. Considering principal eigenvector terms, we see from 
the eigenanalysis equation, 



(7) 



that 



q;(Q,TU(*»«) 



(8) 



Tl,(l,») = ^ (*) > 



for the principal eigenvectors. $\eta_k(\ell, m)$ represents the 
projection of the $m$-th eigenvector onto the $\ell$-th interference 
steering vector. 

Equation (8) is very interesting, showing that each principal 
eigenvector spans all interference steering vectors. Thus, 
inhomogeneity necessarily affects all eigenvalues and 
eigenvectors to a certain degree. This indicates a change in 
covariance structure. The preceding description of 
inhomogeneity is useful since we can apply it to measured data 
using the singular value decomposition (SVD). 
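
The spanning property can be seen from a short derivation. The sketch below assumes a low-rank-plus-noise covariance of the form $R_k = \sum_{\ell=1}^{P} \sigma_k^2(\ell)\, s_k(\ell)\, s_k^H(\ell) + \sigma_n^2 I$, which is an assumption based on the model description; the exact form and normalization of (7) and (8) in the original may differ.

```latex
\begin{align*}
R_k\, q_k(m) &= \lambda_k(m)\, q_k(m) \\
\sum_{\ell=1}^{P} \sigma_k^2(\ell)\, s_k(\ell)\, \bigl[ s_k^H(\ell)\, q_k(m) \bigr]
  + \sigma_n^2\, q_k(m) &= \lambda_k(m)\, q_k(m) \\
q_k(m) &= \sum_{\ell=1}^{P}
  \frac{\sigma_k^2(\ell)\, \eta_k(\ell, m)}{\lambda_k(m) - \sigma_n^2}\, s_k(\ell),
  \qquad \eta_k(\ell, m) = s_k^H(\ell)\, q_k(m)
\end{align*}
```

Under this assumption, each principal eigenvector (with $\lambda_k(m) > \sigma_n^2$) is a weighted sum of all $P$ steering vectors, so a change in any single $\sigma_k^2(\ell)$ or $s_k(\ell)$ perturbs every principal eigenpair.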

V. Measured Data: Examples of Inhomogeneity 

We now show some examples using measured data from the 
MCARM program to validate our view of inhomogeneity. The 
measured data is taken from MCARM flight 5, acquisition 575. 
This data is publicly available and may be found at [11], along 

Fig. 2. Adapted receive patterns, corresponding to the example shown in 
Figure 1, for varying sample support. 



IV. General Multichannel Model For Clutter 
Inhomogeneity 

Using subspace principles, we review a simple model for 
clutter inhomogeneity. This model is discussed in more detail 
in [2]. The low rank clutter-plus-noise covariance matrix, with $\sigma_n^2$ 
representing the noise variance, takes the form 



with more detail on the MCARM system. Figure 3 shows the 
eigenbeams of the spatial covariance matrix in the Doppler filter 
corresponding to a radial velocity of 42 miles per hour for 136 
samples taken from range cell 245 to range cell 285. 
Eigenbeams are essentially Fourier transforms of the covariance 
matrix eigenvectors. As predicted by (8), they commonly point 
towards an interference source. Figure 4 shows the eigenbeams 
for the covariance matrix computed from the subset of 
secondary data, within the original set of 136 range samples, 
determined to be most homogeneous. The test statistic used to 
determine homogeneity is mentioned briefly in the following 
section and in [2, 9]. Observe that the eigenbeam closest to zero 
degrees is absent. We attribute this eigenbeam to moving 
scatterers. In terms of (8), moving scatterers in the secondary 



data set result in an added principal component with eigenvector 
$q(0) \approx s_m(0) / \|s_m(0)\|$, where $s_m(0)$ is the added 
steering vector due to the moving scatterer in the secondary 
data, and an eigenvalue determined by the reflected power of the 
inhomogeneity de-weighted by the number of range samples in 
the MLE. Note that adding samples in the MLE does not 
necessarily improve performance because other inhomogeneity 
may enter the secondary data set. 

Fig. 3. MCARM data. Eigenbeams for block of consecutively selected 
secondary data within Doppler filter 10. 

(STC), shadowing and edge effects lead to underestimated 
eigenvalue magnitude. In terms of (6)-(8), the observed 
shadowing leads to underestimated eigenvalues, 
$\hat{\lambda}_k(m) \approx \epsilon\, \lambda_k(m)$ with $\epsilon < 1$, consequently resulting 
in a condition of undernulled clutter. 

Fig. 5. MCARM data. Eigenspectra across Doppler filters 9 through 16 
showing impact of shadowing. The legend inputs, "mx", indicate sample 
support in terms of available degrees of freedom. 

VI. Approaches For Robust Performance 

Assumptions intrinsic to STAP development, namely the iid 
assumption of the secondary data set, can be routinely violated 
in practical scenarios due to inhomogeneity. Consider the 
simple test statistic of the form 



$$\gamma_m = x_m^H R^{-1} x_m \qquad (9)$$



Basically, (9) determines how well $R^{-1}$ whitens the input data 
vector and is a chi-squared random variable. We may consider 
data vectors similarly whitened, thus having similar values of 
$\gamma_m$, to be homogeneous. In terms of eigenstructure, the 
expected value of (9) measures the sum of the ratio of the data 
vector's unknown eigenvalues to the eigenvalues of $R$ [2]. 
Taking the expected value of (9), after substituting the spectral 

Fig. 4. MCARM data. Eigenbeams for selectively determined secondary 
data chosen as a subset of the same data swath used to produce Figure 3. 



Figure 5 shows the eigenspectra for the MCARM data over 
Doppler filters 9 through 16 (128 pulse FFT) with varying 
sample support. Secondary data are chosen from a near-in 
range, with the window expanding to accommodate greater 
sample support. Despite the use of sensitivity time control 



decomposition for $R^{-1}$ and the Karhunen-Loeve eigenbasis 
representation for $x_m$, validates the preceding statement. 
Applying (9) to MCARM data, we find that the choice of 
secondary data strongly influences attainable detection 
performance. This is shown in Figure 6, where the top plot 
shows the output of the adaptive processor (adaptive nulling in 
Doppler filter 10, sometimes called factored time-space (FTS)) 
with continuous weight updates using an MLE computed from 
range cells taken via a symmetric window (SW) about the 
primary data cell. The number of samples selected is twice the 
processor's degrees of freedom. In contrast, the bottom plot 
depicts performance when computing an MLE from an equal 
number of the most homogeneous samples in this same region. 
Equation (9) assesses relative homogeneity of the secondary 



data in this case. In this instance, an injected target at range cell 
290 with -19 dB SNR is clearly visible when the processor 
carefully selects training data. 
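
The role of the test statistic in screening secondary data can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce Figure 6: the covariance is the sample covariance of the candidate snapshots themselves, and the selection rule (keep the snapshots whose statistic lies nearest the median) is a hypothetical stand-in, since the text does not spell out the exact rule. Function names and the synthetic data are likewise assumptions.

```python
import numpy as np

def whitening_statistic(snapshots, cov):
    """Return x_m^H R^{-1} x_m for each column x_m of `snapshots`."""
    solved = np.linalg.solve(cov, snapshots)          # R^{-1} x_m, column by column
    return np.real(np.sum(snapshots.conj() * solved, axis=0))

def select_homogeneous(snapshots, n_keep):
    """Keep the n_keep snapshots whose statistic is closest to the median.

    The median-distance rule is a hypothetical choice made for this sketch.
    """
    cov = snapshots @ snapshots.conj().T / snapshots.shape[1]
    stat = whitening_statistic(snapshots, cov)
    order = np.argsort(np.abs(stat - np.median(stat)))
    return np.sort(order[:n_keep]), stat

# Synthetic secondary data: white noise with one strong inhomogeneous snapshot.
rng = np.random.default_rng(1)
n_chan, n_snap = 4, 40
data = (rng.standard_normal((n_chan, n_snap))
        + 1j * rng.standard_normal((n_chan, n_snap))) / np.sqrt(2)
data[:, 7] = 20.0                                     # injected outlier at index 7
keep, stat = select_homogeneous(data, n_keep=20)
```

When the covariance is estimated from the same snapshots, the statistics sum to the number of snapshots times the number of channels, so a snapshot that is poorly whitened stands out against an average value equal to the channel count.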

The example in Figure 6 shows that clutter inhomogeneity 
can seriously degrade STAP capability. Other approaches for 
robust STAP performance should be developed to overcome 
some of the effects of clutter inhomogeneity. These approaches 
may not necessarily involve training sample selection. Potential 
approaches include pre-filtering and modification of secondary 
data, use of knowledge bases and application of terrain mapping 
data. It is likely that the best approach depends on the form of 
clutter inhomogeneity. This is a topic for further research. 

VII. Summary 

In this paper we describe the utility of subspace processing in 
light of inhomogeneous clutter. Specifically, through subspace 
techniques, we find that localized training is possible with 
sample support on the order of twice the clutter rank. 
Furthermore, subspace methods provide a framework for 
characterizing clutter inhomogeneity and the resulting changes 
one might expect in covariance matrix structure. Finally, we 
mention an approach for quantifying covariance structure via a 
test statistic that is best understood in terms of eigenanalysis. 
Examples using measured data from the MCARM program 
show that inhomogeneous clutter leads to significantly degraded 
STAP performance. Furthermore, the MCARM data 
corroborates proposed subspace characterizations of clutter 
inhomogeneity. 

Fig. 6. Example of the influence of sample selection on detection 
performance. "FTS_SW" indicates adaptive nulling in Doppler filter 10, using 
a symmetric window (SW) to choose secondary data. "FTS_NH" uses 
data-dependent secondary data selection to yield improved performance 
of the adaptive filter. The notation "N_msmi" indicates the adaptive filter 
output is normalized by $\sqrt{s_t^H R^{-1} s_t}$. 

Abstract — We propose a novel parametric approach for modeling, 
estimation, and detection in space-time adaptive processing 
(STAP) radar systems. The proposed parametric interference mit- 
igation procedures can be applied even when information in only 
a single range gate is available, thus achieving high performance 
gain when the data in the different range gates cannot be assumed 
stationary. The model is based on the Wold-like decomposition 
of two-dimensional (2-D) random fields. It is first shown that 
the same parametric model that results from the 2-D Wold-like 
orthogonal decomposition naturally arises as the physical model 
in the problem of space-time processing of airborne radar data. 
We exploit this correspondence to derive computationally efficient 
fully adaptive and partially adaptive detection algorithms. Having 
estimated the models of the noise and interference components 
of the field, the estimated parameters are substituted into the 
parametric expression of the interference-plus-noise covariance 
matrix. Hence, an estimate of the fully adaptive weight vector 
is obtained, and a corresponding test is derived. Moreover, we 
prove that it is sufficient to estimate only the spectral support 
parameters of each interference component in order to obtain a 
projection matrix onto the subspace orthogonal to the interference 
subspace. The resulting partially adaptive detector is simple to 
implement, as only a very small number of unknown parameters 
need to be estimated, rather than the field covariance matrix. 
The performance of the proposed methods is illustrated using 
numerical examples. 

Index Terms — Airborne radar, clutter, detection, evanescent 
fields, interference mitigation, jamming, STAP, two-dimensional 
random fields, Wold decomposition. 

I. Introduction 

WE PROPOSE a new approach for parametric modeling 
and estimation of space-time airborne radar data, based 
on the two-dimensional (2-D) Wold-like decomposition of 
random fields. Most interestingly, the proposed parametric 
estimation algorithms of the interference components provide 
new tools to estimate and mitigate the Doppler ambiguous 
clutter. The algorithms we develop enable estimation of the interference 
signals using the observations in only a single range 
gate. This property makes the proposed method particularly 
suitable for nonstationary clutter and jamming environments. 
Our modeling approach also provides a new analytical insight 
into the space-time adaptive processing (STAP) problem. 

The goal of STAP is to manipulate the available data to 
achieve high gain at the target's angle and Doppler and maximal 
mitigation along both the jamming and clutter lines. Because 
the interference covariance matrix is unknown a priori, it is 
typically estimated using sample covariances obtained from 
averaging over a few range gates. Next, a weight vector is 
computed from the inverse of the sample covariance matrix 
[1]-[5]. It is shown in [6] that the dominant eigenvectors of 
the space-time covariance matrix contain all the information 
required to mitigate the interference. Thus, the weight vector is 
constrained to be in the subspace orthogonal to the dominant 
eigenvectors. In [8], a reduced-rank constant false alarm 
(CFAR) detection test is developed, assuming the dominant 
eigenvectors of the interference are known, and in [9], a 
multistage partially adaptive CFAR detection algorithm is 
introduced. In [17], an approach that bypasses the need to 
estimate the covariance matrix is presented: The data collected 
in a single range gate is employed to obtain a least-squares 
estimate of the signal power at each hypothesized direction 
of arrival, through evaluation of a weight vector constrained 
to null the unknown interference and noise. In [18], a simple 
ad hoc model of the clutter signal and covariance matrix is 
proposed. The model represents the spectral density of the 
clutter as a sum of Gaussian-shaped humps along the support 
of the clutter ridge. In [19], this model is employed to estimate 
the clutter covariance matrix from the data observed in a single 
range gate. 

In this paper, we adopt the 2-D Wold-like decomposition 
of random fields [10] as the parametric model of the observed 
data. Employing this model, we derive computationally effi- 
cient algorithms useful for parametrically estimating both the 
jamming and clutter fields. The estimation procedure we pro- 
pose is capable of estimating the interference parameters from 
the information in a single range gate. Hence, no averaging 
over a few range gates is required. This property provides 
significant advantage in the practical case where data in the 
different range gates is nonstationary. Having estimated the 
parametric models of the interference terms, their covariance matrix 
can be evaluated based on the estimated parameters. Moreover, 
the problem of evaluating the rank of the low-rank covariance 
matrix of the interference is solved as a byproduct of obtaining 
the parametric estimates of the interference components. 
Once the parametric models of the interference components 
have been estimated, several alternative detection procedures 
are available. In this paper, we present two such methods: 
the parametric fully-adaptive processing and the parametric 
partially-adaptive processing. 

The paper is organized as follows. In Section II, we briefly 
summarize the main results of the 2-D Wold-like decomposi- 
tion and the resulting random field model. Next, in Section III, 
the correspondence between this model and the physical model 
of the STAP data is identified. In Section IV, we elaborate on 
the parametric representation of the covariances of the different 
components of the random field. The estimation algorithm of 
the random field parametric model is presented and analyzed 
in Section V. After the method for estimating the parametric 
models of the different components of the data field has been 
established, we present the parametric fully adaptive processing 
method and the computationally more efficient parametric 
partially adaptive processing method in Sections VI and VII, 
respectively. The performance of both methods is illustrated 
using synthetic data examples. We summarize our conclusions 
in Section VIII. 

II. Random Field Model 

In this section, we briefly review the 2-D Wold-like decom- 
position of random fields and the resulting random field model. 
In the next section, the applicability of this model to STAP data 
will be explained. It is shown in [10] that any 2-D regular and 
homogeneous discrete random field can be represented as a sum of 
two mutually orthogonal components: a purely indeterministic 
(unpredictable in the mean-square sense) field and a determin- 
istic (predictable in the mean-square sense) one. The purely in- 
deterministic component has a unique white innovations driven 
nonsymmetrical half-plane (NSHP) moving average represen- 
tation. The deterministic component is further orthogonally de- 
composed into a harmonic field and a countable number of mu- 
tually orthogonal evanescent fields. This decomposition results 
in a corresponding decomposition of the spectral measure of the 
regular random field into a countable sum of mutually singular 
spectral measures. The purely indeterministic component has an 
absolutely continuous spectral distribution function. The spectral 
measure of the deterministic component is singular with respect 
to the Lebesgue measure, and therefore, it is concentrated 

where $\sum_{(0,0) \preceq (k,\ell)} |b(k, \ell)|^2 < \infty$; $b(0, 0) = 1$, and 
$\{u(n, m)\}$ is the innovations field of $\{y(n, m)\}$. The notation 
$\preceq$ implies that the weighted summation includes $u(n, m)$ and 
all the samples in its "past," where the past is defined with 
respect to any selected NSHP total-ordering on the 2-D lattice 
(see, for example, Fig. 1). 

We call a 2-D deterministic random field $\{e_o(n, m)\}$ evanescent 
w.r.t. the NSHP total-order $o$ if it spans a Hilbert space identical 
to the one spanned by its column-to-column innovations 
at each coordinate $(n, m)$ (w.r.t. the total-order $o$). The deterministic 
field column-to-column innovation at each coordinate 
$(n, m) \in \mathbb{Z}^2$ is defined as the difference between the actual 
value of the field and its projection on the Hilbert space spanned 
by the deterministic field samples in all previous columns. 

It is possible to define [10] a family of NSHP total-order definitions 
such that the boundary line of the NSHP has a rational 
slope. A NSHP of this type is called rational nonsymmetrical 
half-plane (RNSHP) (see, for example, Fig. 1). Let $a$ and $b$ be 
two coprime integers, such that both $a, b \neq 0$. The slope of the 
RNSHP is then given by $-a/b$ (and $\cot\theta = -b/a$). For the case 



on a set of Lebesgue measure zero in the frequency plane. It is 
shown in [12] that under some mild assumptions (that always 
hold in practice), the spectral supports of the different evanes- 
cent components have the form of lines whose slope is a rational 
number. 

More specifically, let $\{y(n, m), (n, m) \in \mathbb{Z}^2\}$ be a complex-valued, 
regular, homogeneous random field. Then, $y(n, m)$ can 
be uniquely represented by the orthogonal decomposition 



$$y(n, m) = w(n, m) + v(n, m). \qquad (1)$$



where $a = 0$, the RNSHP is uniquely defined by setting $b = 1$. 
(For the case where $b = 0$, the RNSHP is uniquely defined by 
setting $a = 1$.) We denote by $O$ the set of all possible RNSHP 
definitions on the 2-D lattice (i.e., the set of all NSHP definitions 
in which the boundary line of the NSHP has a rational 
slope). The introduction of the family of RNSHP total-ordering 
definitions results in the following countably infinite orthogonal 
decomposition of the deterministic component of the random 
field: 



$$v(n, m) = p(n, m) + \sum_{(a,b) \in O} e_{(a,b)}(n, m). \qquad (3)$$

The field $\{v(n, m)\}$ is a deterministic random field. The field 
$\{w(n, m)\}$ is purely indeterministic and has a unique white 
innovations driven moving average representation, which is given 
by 

$$w(n, m) = \sum_{(0,0) \preceq (k,\ell)} b(k, \ell)\, u(n - k, m - \ell). \qquad (2)$$



The random field $\{p(n, m)\}$ is half-plane deterministic, i.e., it 
has no column-to-column innovations w.r.t. any RNSHP total-ordering 
definition. The field $\{e_{(a,b)}(n, m)\}$ is the evanescent 
component that generates the column-to-column innovations of 
the deterministic field w.r.t. the RNSHP total-ordering definition 
$(a, b) \in O$. 

Hence, if $\{y(n, m)\}$ is a 2-D regular and homogeneous 
random field, then $y(n, m)$ can be uniquely represented by the 
orthogonal decomposition 

$$y(n, m) = w(n, m) + p(n, m) + \sum_{(a,b) \in O} e_{(a,b)}(n, m). \qquad (4)$$

In the following, all spectral measures are defined on the 
square K = [-1/2, 1/2] x [-1/2, 1/2]. It is shown in [10] 
and [11] that the spectral measures of the decomposition com- 
ponents in (4) are mutually singular. The spectral distribution 
function of the purely indeterministic component is absolutely 
continuous, whereas the spectral measures of the half-plane de- 
terministic component and of all the evanescent components are 
concentrated on a set of Lebesgue measure zero in K. A model 
for the evanescent field that corresponds to the RNSHP defined 
by $(a, b) \in O$ is given by 

$$e_{(a,b)}(n, m) = \sum_{i=1}^{I^{(a,b)}} e_i^{(a,b)}(n, m) = \sum_{i=1}^{I^{(a,b)}} s_i^{(a,b)}(na + mb)\, \exp\left(j 2\pi \nu_i^{(a,b)} (nc + md)\right) \qquad (5)$$

where $c$ and $d$ are coprime integers satisfying $ad - bc = 1$. For 
the case where $(a, b) = (0, 1)$, we have $(c, d) = (1, 0)$, and for 
$(a, b) = (1, 0)$, we have $(c, d) = (0, 1)$. The 1-D purely indeterministic, 
complex-valued processes $\{s_i^{(a,b)}(na + mb)\}$ and 
$\{s_j^{(a,b)}(na + mb)\}$ are zero-mean and mutually orthogonal for 
all $i \neq j$. Hence, the "spectral density function" of each evanescent 
field has the form of a sum of 1-D delta functions that are 
supported on lines of rational slope in the 2-D spectral domain. 
The amplitude of each of these delta functions is determined by 
the spectral density of the 1-D modulating process. Since the 
spectral density of the modulating process can rapidly decay 
to zero, so will the "spectral density" of the evanescent field, 
hence the name "evanescent." Since interchanging the roles of 
past and future in any total-order definition amounts to substituting 
$\nu_i^{(a,b)}$ by $-\nu_i^{(a,b)}$ in the model (5), we assume without 
limiting the generality of the derivation that $a \geq 0$, and $b$ can 
assume any integer value. 
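
The evanescent model (5) can be made concrete with a short sketch that finds one valid $(c, d)$ pair and synthesizes a single evanescent component. The function names, the array layout (sensor index along axis 0, time index along axis 1), and the indexing of the modulating samples are assumptions made for illustration. The pair $(c, d)$ is not unique; the extended Euclidean algorithm used here returns one solution, and the special case $(a, b) = (0, 1)$ is hard-coded to the value given in the text.

```python
import numpy as np

def bezout_cd(a, b):
    """Return integers (c, d) with a*d - b*c == 1 for coprime integers a, b."""
    if (a, b) == (0, 1):
        return 1, 0                      # value stated in the text for this case
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:                        # invariant: a*old_x + b*old_y == old_r
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if abs(old_r) != 1:
        raise ValueError("a and b must be coprime")
    # a*(old_x*old_r) + b*(old_y*old_r) == old_r**2 == 1
    return -old_y * old_r, old_x * old_r

def evanescent_component(modulating, a, b, nu, n_sensors, n_pulses):
    """One component s(na + mb) * exp(j 2 pi nu (nc + md)), following (5)."""
    c, d = bezout_cd(a, b)
    n, m = np.meshgrid(np.arange(n_sensors), np.arange(n_pulses), indexing="ij")
    k = n * a + m * b
    # `modulating` holds s(k) for k = k.min(), ..., k.max() (assumed convention)
    return modulating[k - k.min()] * np.exp(2j * np.pi * nu * (n * c + m * d))

rng = np.random.default_rng(2)
S, T, a, b = 6, 5, 1, 2
n_cols = (S - 1) * a + (T - 1) * b + 1   # range of k when a, b > 0
s_mod = rng.standard_normal(n_cols) + 1j * rng.standard_normal(n_cols)
field = evanescent_component(s_mod, a, b, 0.1, S, T)
```

Grid points that share the same value of $na + mb$ reuse the same modulating sample, which is why the magnitude of the field is constant along each such line.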

One of the half-plane-deterministic field components, which 
is of prime importance in the STAP problem, is the harmonic 
random field 

$$h(n, m) = \sum_{p=1}^{P} C_p \exp\left(j 2\pi (n \omega_p + m \nu_p)\right) \qquad (6)$$

where the $C_p$'s are mutually orthogonal random variables, and 
$(\omega_p, \nu_p)$ are the spatial frequencies of the $p$th harmonic. 



III. STAP Model and the 2-D Wold Decomposition 

The random field parametric model that results from the 
2-D Wold-like orthogonal decomposition naturally arises as 
the physical model in the problem of space-time processing of 
airborne radar data. Let n denote the sensor index, and let m 
be the time index. In the STAP problem, the target signal is 
modeled as a random amplitude complex exponential where the 
exponential is defined by a space-time steering vector that has 
the target's angle and Doppler. In other words, in the space-time 
domain the target model is that of a 2-D harmonic component 
similar to (6). The sum of the white noise field due to the 
internally generated receiver amplifier noise, and the colored 
noise field due to the sky noise contribution, is the purely 
indeterministic component of the space-time field decompo- 
sition. The presence of a jammer results in a barrage of noise 
localized in angle and uniformly distributed over all Doppler 
frequencies. Hence, in the space-time domain, each jammer 
is modeled as an evanescent component with $(a, b) = (0, 1)$ 
such that its 1-D modulating process $s_i^{(0,1)}(m)$ is the random 
process of the jammer amplitudes. The jammer samples from 
different pulses are uncorrelated. In the angle-Doppler domain 
each jammer contributes a 1-D delta function, parallel to the 
Doppler axis and located at a specific angle $\nu_i^{(0,1)}$ [using the 
notation of (5)]. The ground clutter results in an additional 
evanescent component of the observed 2-D space-time field. 
The clutter's echo from a single ground patch has a Doppler 
frequency that linearly depends on its aspect with respect to 
the platform. Hence, clutter from all angles lies in a "clutter 
ridge," which is supported on a diagonal line (that generally 
wraps around in Doppler) in the angle-Doppler domain. A 
model of the clutter field is then given by (5) with the slope 
of the clutter ridge given by $b/a$ and with $s_i^{(a,b)}(na + mb)$ 
being a 1-D colored noise process. Since the rational numbers 
are dense in the set of real numbers, an irrational slope of the 
clutter ridge can be approximated arbitrarily closely by a rational 
one. Hence, any clutter signal can be either exactly modeled or 
approximated by an evanescent field. 

Fig. 2 graphically illustrates a typical example of the 
matching between the 2-D Wold decomposition based para- 
metric random field model and the physical model of STAP 
data. In this synthetic example, the observed random field is 
the sum of two evanescent components that correspond to the 
clutter component with $(a, b) = (1, 2)$, $\nu^{(1,2)} = 0$, and a 
jammer with $\nu^{(0,1)} = 0.2$. Fig. 2 depicts the magnitude of the 
DFT of the observed field. 
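
A field of this kind can be synthesized in a few lines. The sketch below is an illustration only: the grid size, the jammer-to-clutter amplitude ratio, and the use of a white (rather than colored) clutter modulating process are assumptions made to keep the example short, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
S, T = 20, 32                      # sensors, pulses (illustrative sizes)
n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")

def complex_noise(size):
    """Unit-variance circular complex white noise."""
    return (rng.standard_normal(size) + 1j * rng.standard_normal(size)) / np.sqrt(2)

# Clutter: (a, b) = (1, 2) and nu = 0, so e(n, m) = s(n + 2m).
s_clutter = complex_noise((S - 1) + 2 * (T - 1) + 1)
clutter = s_clutter[n + 2 * m]

# Jammer: (a, b) = (0, 1), (c, d) = (1, 0), nu = 0.2,
# so e(n, m) = s(m) * exp(j 2 pi 0.2 n).
s_jammer = 3.0 * complex_noise(T)
jammer = s_jammer[m] * np.exp(2j * np.pi * 0.2 * n)

field = clutter + jammer
spectrum = np.abs(np.fft.fft2(field))          # axis 0: spatial, axis 1: Doppler
spatial_profile = (spectrum ** 2).sum(axis=1)  # energy per spatial-frequency bin
jammer_bin = int(np.argmax(spatial_profile))
jammer_freq = np.fft.fftfreq(S)[jammer_bin]
```

Summing the squared DFT magnitude over Doppler collapses the jammer line onto a single spatial-frequency bin, while the clutter ridge spreads over many bins, which is the structure described for Fig. 2.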

We therefore conclude that the foregoing derivation opens the 
way for new parametric solutions that can simplify and improve 
existing methods of STAP. 



IV. Covariance Structure of the Observed Field 

Based on the random field model derived in the previous sections, 
we derive in this section a closed-form parametric expression 
for the covariance matrix of the observed STAP data 
field in terms of the model parameters. We begin by stating our 
assumptions. 

Fig. 2. Magnitude of the DFT of an observed field containing two evanescent 
components that correspond to a clutter component with $(a, b) = (1, 2)$, 
$\nu^{(1,2)} = 0$, and a jammer with $\nu^{(0,1)} = 0.2$. (Axis label: spatial frequency.) 

Let $\{y(n, m)\}$, $(n, m) \in D$, where $D = \{(i, j) \mid 0 \leq i \leq 
S - 1,\ 0 \leq j \leq T - 1\}$, be the observed random field. 

Assumption 1: The purely indeterministic component 
$\{w(n, m)\}$ is a zero-mean circular complex-valued random 
field. 

Assumption 2: The number $I = \sum_{(a,b) \in O} I^{(a,b)}$ of evanescent 
components in the field is a priori known. This assumption 
can be later relaxed. 

Assumption 3: For each evanescent field $\{e_i^{(a,b)}\}$, the 
modulating 1-D purely indeterministic process $\{s_i^{(a,b)}\}$ is a 
zero-mean circular complex-valued process. 

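Let 

$$\mathbf{y} = [y(0,0), \ldots, y(0,T-1), y(1,0), \ldots, y(1,T-1), \ldots, y(S-1,0), \ldots, y(S-1,T-1)]^T \qquad (7)$$

$$\mathbf{w} = [w(0,0), \ldots, w(0,T-1), w(1,0), \ldots, w(1,T-1), \ldots, w(S-1,0), \ldots, w(S-1,T-1)]^T \qquad (8)$$

$$\mathbf{e}_i^{(a,b)} = [e_i^{(a,b)}(0,0), \ldots, e_i^{(a,b)}(0,T-1), e_i^{(a,b)}(1,0), \ldots, e_i^{(a,b)}(1,T-1), \ldots, e_i^{(a,b)}(S-1,0), \ldots, e_i^{(a,b)}(S-1,T-1)]^T. \qquad (9)$$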

Let 

$$\boldsymbol{\xi}_i^{(a,b)} = [s_i^{(a,b)}(0), s_i^{(a,b)}(b), \ldots, s_i^{(a,b)}((T-1)b), s_i^{(a,b)}(a), s_i^{(a,b)}(a+b), \ldots, s_i^{(a,b)}(a+(T-1)b), \ldots, s_i^{(a,b)}((S-1)a), s_i^{(a,b)}((S-1)a+b), \ldots, s_i^{(a,b)}((S-1)a+(T-1)b)]^T \qquad (10)$$

be the vector whose elements are the observed samples from the 
1-D modulating process $\{s_i^{(a,b)}\}$. Define 

$$\mathbf{v}^{(a,b)} = [0, d, \ldots, (T-1)d, c, c+d, \ldots, c+(T-1)d, \ldots, (S-1)c, (S-1)c+d, \ldots, (S-1)c+(T-1)d]^T. \qquad (11)$$

Given a scalar function $f(v)$, we will denote the matrix, or 
column vector, consisting of the values of $f(v)$ evaluated for all 
the elements of $\mathbf{v}$, where $\mathbf{v}$ is a matrix, or a column vector, by 
$f(\mathbf{v})$. Using this notation, we define 

$$\mathbf{d}_i^{(a,b)} = \exp\left(j 2\pi \nu_i^{(a,b)} \mathbf{v}^{(a,b)}\right). \qquad (12)$$

Thus, using (5), we have that 

$$\mathbf{e}_i^{(a,b)} = \boldsymbol{\xi}_i^{(a,b)} \odot \mathbf{d}_i^{(a,b)} \qquad (13)$$

where $\odot$ denotes an element-by-element product of the vectors. 

Note that whenever $na + mb = ka + \ell b$ for some integers 
$n, m, k, \ell$ such that $0 \leq n, k \leq S - 1$ and $0 \leq m, \ell \leq T - 1$, 
the same sample from the modulating process $\{s_i^{(a,b)}\}$ is 
duplicated in the elements of $\boldsymbol{\xi}_i^{(a,b)}$. It is shown in [15] that for 
a rectangular observed field of dimensions $S \times T$, the number 
of distinct samples from the random process $\{s_i^{(a,b)}\}$ that are 
found in the observed field is 

$$N_c = (S - 1)|a| + (T - 1)|b| + 1 - (|a| - 1)(|b| - 1). \qquad (14)$$

This is because $N_c$ is the number of different "columns" one 
can define on such a rectangular lattice for an RNSHP defined 
by $(a, b)$. We note here that in the special case where $a = 1$, 
(14) provides the well-known Brennan rule [3] on the rank of 
the clutter covariance matrix. 
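
The count in (14) can be checked by brute force on small lattices. The sketch below simply enumerates the distinct values of $na + mb$ and compares them with the closed form for a few example grids; the function names are hypothetical, and the comparison is made only for the listed cases rather than claimed for every lattice size.

```python
def count_distinct_columns(a, b, n_sensors, n_pulses):
    """Number of distinct values of n*a + m*b on an n_sensors x n_pulses lattice."""
    return len({n * a + m * b
                for n in range(n_sensors)
                for m in range(n_pulses)})

def n_c_formula(a, b, n_sensors, n_pulses):
    """Closed-form count from (14)."""
    return ((n_sensors - 1) * abs(a) + (n_pulses - 1) * abs(b) + 1
            - (abs(a) - 1) * (abs(b) - 1))

# Example lattices given as (a, b, S, T); with a = 1 the count reduces to
# S + (T - 1)|b|, the Brennan-rule form mentioned in the text.
examples = [(1, 2, 3, 3), (2, 3, 4, 4), (1, -2, 3, 3), (1, 1, 4, 5), (0, 1, 4, 6)]
counts = [(count_distinct_columns(*ex), n_c_formula(*ex)) for ex in examples]
```

For the jammer case $(a, b) = (0, 1)$ the count equals the number of pulses, consistent with one independent jammer amplitude per pulse.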

We therefore define the concentrated version $\mathbf{s}_i^{(a,b)}$ of $\boldsymbol{\xi}_i^{(a,b)}$ 
to be an $N_c$-dimensional column vector of nonrepeating samples 
of the process $\{s_i^{(a,b)}\}$. More specifically, for the case in 
which $a > 0$ and $b < 0$, $\mathbf{s}_i^{(a,b)}$ is given by 

$$\mathbf{s}_i^{(a,b)} = [s_i^{(a,b)}((T-1)b), \ldots, s_i^{(a,b)}((S-1)a)]^T \qquad (15)$$

whereas for the case in which $a > 0$ and $b > 0$, $\mathbf{s}_i^{(a,b)}$ is given 
by 

$$\mathbf{s}_i^{(a,b)} = [s_i^{(a,b)}(0), \ldots, s_i^{(a,b)}((S-1)a + (T-1)b)]^T. \qquad (16)$$

Thus, for any $(a, b)$, we have that 

$$\boldsymbol{\xi}_i^{(a,b)} = A_i^{(a,b)} \mathbf{s}_i^{(a,b)} \qquad (17)$$

where $A_i^{(a,b)}$ is a rectangular matrix of zeros and ones that replicates 
rows of $\mathbf{s}_i^{(a,b)}$. 

Note, however, that due to boundary effects, the vector $\mathbf{s}_i^{(a,b)}$ 
is not composed of consecutive samples from the process 
$\{s_i^{(a,b)}\}$ unless $|a| \leq 1$ or $|b| \leq 1$. In other words, for some 
arbitrary $a$ and $b$, there are missing samples in $\mathbf{s}_i^{(a,b)}$. 

We note that the covariance matrix $R_i^{(a,b)}$, which characterizes 
the second-order properties of the process $\{s_i^{(a,b)}\}$, is defined 
in terms of the concentrated version vector $\mathbf{s}_i^{(a,b)}$, i.e., 

$$R_i^{(a,b)} = E\left[\mathbf{s}_i^{(a,b)} \left(\mathbf{s}_i^{(a,b)}\right)^H\right] \qquad (18)$$

and not in terms of the covariance matrix 

$$\tilde{R}_i^{(a,b)} = E\left[\boldsymbol{\xi}_i^{(a,b)} \left(\boldsymbol{\xi}_i^{(a,b)}\right)^H\right] \qquad (19)$$

of the vector $\boldsymbol{\xi}_i^{(a,b)}$. The matrix $\tilde{R}_i^{(a,b)}$ is a singular matrix, 
where $\tilde{R}_i^{(a,b)} = A_i^{(a,b)} R_i^{(a,b)} \left(A_i^{(a,b)}\right)^T$. 

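Since the evanescent components $\{e_i^{(a,b)}\}$ are mutually orthogonal 
and since all the evanescent components are orthogonal 
to the purely indeterministic component, we conclude that 
$\Gamma$, which is the covariance matrix of $\mathbf{y}$, has the form 

$$\Gamma = \Gamma_w + \sum_{(a,b) \in O} \sum_{i=1}^{I^{(a,b)}} \Gamma_i^{(a,b)} \qquad (20)$$

where $\Gamma_w$ denotes the covariance matrix of $\mathbf{w}$ and $\Gamma_i^{(a,b)}$ is the covariance matrix of $\mathbf{e}_i^{(a,b)}$. 
Using (5) and (13), we find that 

$$\Gamma_i^{(a,b)} = \left(\mathbf{d}_i^{(a,b)} \left(\mathbf{d}_i^{(a,b)}\right)^H\right) \odot \left(A_i^{(a,b)} R_i^{(a,b)} \left(A_i^{(a,b)}\right)^T\right). \qquad (21)$$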



A compact matrix representation of $\Gamma_i^{(a,b)}$ for any $(a, b)$ 
cannot be derived due to the dependence of the matrix structure 
on $(a, b)$. However, for the case in which $(a, b) = (0, 1)$ 
(and similarly for $(a, b) = (1, 0)$), a somewhat more compact 
representation is possible, using Kronecker products instead of 
the Hadamard products. 



More specifically, for this special case, (13) can be expressed 
in the form 

$$\mathbf{e}_i^{(0,1)} = \mathbf{d}_i^{(0,1)} \otimes \mathbf{s}_i^{(0,1)} \qquad (22)$$

where $\otimes$ is the Kronecker product. Hence 

$$\Gamma_i^{(0,1)} = \left(\mathbf{d}_i^{(0,1)} \left(\mathbf{d}_i^{(0,1)}\right)^H\right) \otimes R_i^{(0,1)} = E_i^{(0,1)} \otimes R_i^{(0,1)} \qquad (23)$$

where $R_i^{(0,1)}$ and $E_i^{(0,1)}$ are Toeplitz matrices, given by (24) 
and (25) below. 
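
The Kronecker structure for the jammer case can be checked numerically. The sketch below assumes a particular reading of (22): the phase vector is taken as the $S$-vector of spatial phases and the amplitude vector as the $T$-vector of per-pulse jammer amplitudes, ordered as in (7). The sizes, the frequency value, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
S, T, nu = 4, 3, 0.2

# Assumed reading of (22): d holds the S spatial phases, s the T jammer amplitudes.
d = np.exp(2j * np.pi * nu * np.arange(S))
s = (rng.standard_normal(T) + 1j * rng.standard_normal(T)) / np.sqrt(2)

# Element n*T + m of e equals d[n] * s[m], matching the ordering used in (7).
e = np.kron(d, s)

E = np.outer(d, d.conj())           # S x S Toeplitz matrix of phase differences
outer_s = np.outer(s, s.conj())     # one-realization stand-in for the T x T covariance
lhs = np.outer(e, e.conj())
rhs = np.kron(E, outer_s)           # mixed-product property of the Kronecker product

# With uncorrelated pulses of unit power, R = I_T and the covariance is E kron I_T.
gamma = np.kron(E, np.eye(T))
rank_gamma = np.linalg.matrix_rank(gamma)
```

Because the phase matrix has rank one, the jammer covariance in this sketch has rank equal to the number of pulses, in line with the count given by (14) for $(a, b) = (0, 1)$.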

V. Parametric Estimation of the 
Interference Components 

In this section, we derive a computationally efficient algorithm 
for estimating both the jamming and clutter fields, based 
on the above results. More specifically, for each interference 
component of the observed field, we estimate its spectral support 
parameters $a$, $b$, $\nu_i^{(a,b)}$, as well as $c$, $d$, and the parametric 
model of the modulating 1-D purely indeterministic process 
$\{s_i^{(a,b)}\}$. In the setting of the radar problem considered here, 
partial information on the different components of the field is 
a priori known: The jamming signals are localized in angle 
and distributed over all Doppler frequencies. Thus, each jammer 
contributes an evanescent component with spectral support parameters 
$(a, b) = (0, 1)$ and an unknown frequency $\nu_i^{(0,1)}$. The 
clutter signal is also modeled as an evanescent component with 
$\nu^{(a,b)} = 0$ and an unknown $(a, b)$ pair, which is uniquely determined 
by the platform motion parameters. 

The proposed estimation algorithm of the spectral support parameters 
of the evanescent field $a$, $b$, and $\nu_i^{(a,b)}$ is based on the 
following lemma. 

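$$R_i^{(0,1)} = \begin{bmatrix} r_i^{(0,1)}(0) & r_i^{(0,1)}(-1) & \cdots & r_i^{(0,1)}(-(T-1)) \\ r_i^{(0,1)}(1) & r_i^{(0,1)}(0) & \cdots & r_i^{(0,1)}(-(T-2)) \\ \vdots & \vdots & \ddots & \vdots \\ r_i^{(0,1)}(T-1) & r_i^{(0,1)}(T-2) & \cdots & r_i^{(0,1)}(0) \end{bmatrix} \qquad (24)$$

$$E_i^{(0,1)} = \begin{bmatrix} 1 & \exp(-j2\pi\nu_i^{(0,1)}) & \cdots & \exp(-j2\pi(S-1)\nu_i^{(0,1)}) \\ \exp(j2\pi\nu_i^{(0,1)}) & 1 & \cdots & \exp(-j2\pi(S-2)\nu_i^{(0,1)}) \\ \vdots & \vdots & \ddots & \vdots \\ \exp(j2\pi(S-1)\nu_i^{(0,1)}) & \exp(j2\pi(S-2)\nu_i^{(0,1)}) & \cdots & 1 \end{bmatrix} \qquad (25)$$

Lemma 1: Let $\{e_i^{(a,b)}(n, m)\}$ be an evanescent field and let 
$k$ be an integer. The samples of the evanescent field along a line 
on the sampling grid defined by $k = na + mb$ are the samples 
of a 1-D constant amplitude harmonic signal, whose frequency 
is $\nu_i^{(a,b)}$. 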

Proof: Since for fixed $a$, $b$, $k$, $k = na + mb$ is a linear 
Diophantine equation (see the Appendix), its solutions are given 
by 

$$n = n_k + tb \qquad (26)$$
$$m = m_k - ta \qquad (27)$$

where $(n_k, m_k)$ is a solution of the equation, and $t$ is an integer 
such that the sequence of consecutive values of $t$ corresponds 
to the different lattice points on the line $k = na + mb$. From 
(5), we have, for the evanescent field samples along the line 
$k = na + mb$, 

$$\begin{aligned} e_i^{(a,b)}(n, m) &= s_i^{(a,b)}(na + mb) \exp\left(j2\pi\nu_i^{(a,b)}(nc + md)\right) \\ &= s_i^{(a,b)}(k) \exp\left(j2\pi\nu_i^{(a,b)}(n_k c + m_k d - t(ad - bc))\right) \\ &= \left[s_i^{(a,b)}(k) \exp\left(j2\pi\nu_i^{(a,b)}(n_k c + m_k d)\right)\right] \exp\left(-j2\pi\nu_i^{(a,b)} t\right) \end{aligned} \qquad (28)$$

where the last equality is because $c$, $d$ are coprime integers such 
that $ad - bc = 1$. Hence, in each realization and for a fixed $k$, 
$s_i^{(a,b)}(k) \exp(j2\pi\nu_i^{(a,b)}(n_k c + m_k d))$ is a (random) constant. 
Hence, the proof follows. ■ 
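
Lemma 1 can be verified numerically on a noise-free synthetic field. The sketch below assumes $a, b > 0$ so that ordering the points on a line by increasing sensor index matches increasing $t$ in (26); the grid size, the parameter values, and the helper name are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
S, T = 12, 12
a, b, c, d = 1, 2, 0, 1            # satisfies a*d - b*c = 1
nu = 0.15

n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
k_grid = n * a + m * b
n_cols = int(k_grid.max()) + 1
s_mod = rng.standard_normal(n_cols) + 1j * rng.standard_normal(n_cols)
field = s_mod[k_grid] * np.exp(2j * np.pi * nu * (n * c + m * d))

def samples_on_line(data, a, b, k):
    """Field samples on the line k = i*a + j*b, ordered by increasing i.

    For b > 0, increasing i corresponds to increasing t in n = n_k + t*b.
    """
    rows, cols = data.shape
    points = sorted((i, j) for i in range(rows) for j in range(cols)
                    if i * a + j * b == k)
    return np.array([data[i, j] for i, j in points])

line = samples_on_line(field, a, b, k=10)
ratios = line[1:] / line[:-1]      # phase step between consecutive lattice points
```

Each consecutive pair of samples on the line differs by the factor $\exp(-j2\pi\nu)$, which is the last factor in (28), and all samples share the magnitude of the single modulating sample $s(k)$.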
The algorithm is implemented by the following four-step 
procedure: 

Initial estimation of a and b: In the presence of an evanes- 
cent component, the peaks of the observed field periodogram are 
concentrated along a straight line such that its slope is defined 
by the two coprime integers a and b. Hence, several alternative 
approaches for obtaining an initial estimate of the spectral sup- 
port parameters of the evanescent component can be derived by 
taking the Radon or Hough transforms [20] of the observed field 
periodogram. (The current implementation employs the Hough 
transform for detecting straight lines in 2-D arrays.) However,
due to the presence of noise, this estimate may be perturbed. Since, on a
finite-dimension observed field, only a finite number of possible
(a, b) pairs may be defined, the output of the initial stage is a set
of possible (a, b) pairs such that the ratio b/a is close to the ratio 
obtained for the (a, b) pair estimated by the Hough transform. 

Estimation of the frequency parameter of the evanescent
component: For each possible (a, b) pair, we next evaluate the
frequency parameter of the evanescent component, $\nu_i^{(a,b)}$.
Assuming the considered (a, b) pair is the correct one, we have,
from Lemma 1, that in the absence of background noise, for a
fixed k = na + mb (i.e., along a line on the sampling grid), the
samples of the evanescent component are the samples of a 1-D
constant amplitude harmonic signal whose frequency is $\nu_i^{(a,b)}$.
Hence, by considering the samples along such a line, we obtain
samples of a 1-D constant amplitude harmonic signal whose
frequency $\nu_i^{(a,b)}$ can be easily estimated using any standard
frequency estimation algorithm (e.g., the 1-D DFT).



Final estimation of the spectral support parameters of
each evanescent component: The test for detecting the correct
(a, b) and $\nu_i^{(a,b)}$ is then based on multiplying the observed
signal y(n, m) by $\exp(-j2\pi\nu_i^{(a,b)}(nc + md))$ for each of the
considered a, b, and $\nu_i^{(a,b)}$ triplets and evaluating the variance
of this signal along a line on the sampling grid such that
k = na + mb. Clearly, the best estimate of a, b, and $\nu_i^{(a,b)}$ is the
one that results in minimal variance for the 1-D sequence because,
in the absence of noise, the correct a, b, and $\nu_i^{(a,b)}$ result
in a zero variance. Note that c, d are two coprime integers
satisfying the linear Diophantine equation ad - bc = 1 when a, b
are replaced by their estimated values. Clearly, c, d obtained
as solutions to the linear Diophantine equation are not unique
(see the Appendix). The correct pair c, d is then determined by
employing the symmetry properties of the field covariance
sequence (see [12] for details). Since, in the STAP problem, it is
a priori known that for the jammers (a, b) = (0, 1), whereas
for the clutter $\nu^{(a,b)} = 0$, the parameters c, d do not appear in
the model and, hence, need not be estimated. Nevertheless, to
maintain the generality of the algorithm description, we proceed
to the final step of the algorithm with the general description,
assuming c, d have been estimated (or are a priori known, as in
the STAP case).

Estimating the model of the 1-D purely indeterministic
modulating process of the evanescent field: Having estimated
the spectral support parameters of each evanescent component,
we take the approach of first estimating a nonparametric
representation of its 1-D purely indeterministic modulating process
$\{s_i^{(a,b)}\}$, and only at a second stage do we estimate the
parametric models of these processes. Hence, in the first stage, we
estimate the particular values that the vectors $\mathbf{s}_i^{(a,b)}$ take for the
given realization, i.e., we treat these as unknown constants. The
estimation procedure is implemented as follows: Multiplying
the observed signal y(n, m) by $\exp(-j2\pi\nu_i^{(a,b)}(nc + md))$
and evaluating the arithmetic mean of this signal along a line
on the sampling grid such that k = na + mb, we have

$$\hat{s}_i^{(a,b)}(k) = \frac{1}{N_s}\sum_{na+mb=k} y(n, m)\exp\left(-j2\pi\nu_i^{(a,b)}(nc + md)\right) \tag{29}$$

where $N_s$ denotes the number of the observed field samples that
satisfy the relation na + mb = k. Once we have obtained the sequence
of estimated samples from the 1-D modulating process $\{s_i^{(a,b)}\}$,
the problem of estimating its parametric model becomes entirely
a 1-D estimation problem. Assuming the modulating process is
an autoregressive (AR) process and applying to the sequence an
AR estimation algorithm (see, e.g., [21]), we obtain estimates
of the modulating process parameters as well.
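
The variance test of the third step and the averaging in (29) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the (a, b) pair and a matching (c, d) are taken as already known, the frequency is searched on a coarse grid rather than estimated by a 1-D DFT, and the synthetic field, grid size, and noise level are made up for the example.

```python
import numpy as np

def grid_lines(S, T, a, b):
    """Index arrays n, m and the line index k = n*a + m*b for an S x T grid."""
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    return n, m, n * a + m * b

def demodulate(y, n, m, c, d, nu):
    """Multiply y(n, m) by exp(-j*2*pi*nu*(n*c + m*d))."""
    return y * np.exp(-2j * np.pi * nu * (n * c + m * d))

def mean_line_variance(z, k):
    """Average, over all lines k, of the sample variance of z along the line."""
    return float(np.mean([np.var(z[k == kk]) for kk in np.unique(k)]))

def line_means(z, k):
    """Eq. (29): arithmetic mean of the demodulated samples on each line."""
    ks = np.unique(k)
    return ks, np.array([z[k == kk].mean() for kk in ks])

# Synthetic field (all numerical values are illustrative assumptions).
rng = np.random.default_rng(0)
S, T, a, b, c, d, nu_true = 16, 16, 1, 2, 0, 1, 0.1
n, m, k = grid_lines(S, T, a, b)
ks = np.unique(k)
s_true = rng.standard_normal(ks.size) + 1j * rng.standard_normal(ks.size)
e = s_true[np.searchsorted(ks, k)] * np.exp(2j * np.pi * nu_true * (n * c + m * d))
y = e + 0.01 * (rng.standard_normal(e.shape) + 1j * rng.standard_normal(e.shape))

# Step 3: pick the candidate frequency with minimal variance along the lines.
candidates = np.linspace(-0.5, 0.45, 20)     # search grid with spacing 0.05
scores = [mean_line_variance(demodulate(y, n, m, c, d, nu), k) for nu in candidates]
nu_hat = candidates[int(np.argmin(scores))]

# Step 4: nonparametric estimate of the modulating process via eq. (29).
_, s_hat = line_means(demodulate(y, n, m, c, d, nu_hat), k)
```

With the correct frequency, the demodulated samples on each line reduce to $s(k)$ plus noise, so the per-line mean recovers the modulating sequence, which can then be passed to any 1-D AR estimator.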

Finally, it is important to note that we solve the difficult
problem of evaluating the rank of the low-rank covariance
matrix of the interference as a byproduct of obtaining the
parametric estimates of the interference components. Denote
the number of evanescent components (interference
sources) of the field by Q. It is then shown in [16] that
the rank of the interference covariance matrix is given by
$S\sum_{i=1}^{Q}|a_i| + T\sum_{i=1}^{Q}|b_i| - \sum_{i=1}^{Q}|a_i|\sum_{i=1}^{Q}|b_i|$. In fact,
the special case where Q = 1 and a = 1 is the well-known
Brennan rule [3] on the rank of the clutter covariance matrix.
Hence, following the estimation of the spectral support
parameters of the different evanescent components, the rank of the
interference covariance matrix is also determined.
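
As a small worked example of the rank expression, the sketch below evaluates it for given support parameters. For a single component on the 48 x 48 field with (a, b) = (1, 2) used later in the paper, it also counts the distinct lines k = na + mb that cross the grid. The function names are hypothetical, and the line count is offered only as a sanity check for this particular case.

```python
import numpy as np

def interference_rank(S, T, supports):
    """Rank expression S*sum|a_i| + T*sum|b_i| - sum|a_i|*sum|b_i|
    for a list of (a_i, b_i) spectral support pairs."""
    sum_a = sum(abs(a) for a, _ in supports)
    sum_b = sum(abs(b) for _, b in supports)
    return S * sum_a + T * sum_b - sum_a * sum_b

def count_distinct_lines(S, T, a, b):
    """Number of distinct values of k = n*a + m*b on an S x T grid."""
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    return int(np.unique(n * a + m * b).size)

rank_example = interference_rank(48, 48, [(1, 2)])
lines_example = count_distinct_lines(48, 48, 1, 2)
# With Q = 1 and a = 1 the expression reduces to S + (T - 1)*|b|.
brennan_like = interference_rank(16, 12, [(1, 1)])
```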

VI. Parametric Fully Adaptive Processing 

Having estimated the parametric models of the purely inde- 
terministic and evanescent components of the field, the esti- 
mated parameters can be substituted into (20) and (21) to obtain 
an estimate of the interference-plus-noise covariance matrix $\Gamma$.
In this section, we show how the estimated interference-plus- 
noise covariance matrix is employed to obtain a fully adaptive 
space-time filter. 

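Let $\mathbf{v}_t$ denote the target steering vector given by

$$\mathbf{v}_t = \mathbf{b}(\omega_t) \otimes \mathbf{a}(\vartheta_t). \tag{30}$$

Assuming a linear, uniformly spaced sensor array and a uniform
coherent-processing interval (CPI) are employed in our model,
the spatial steering vector $\mathbf{a}(\vartheta)$ and the temporal steering vector
$\mathbf{b}(\omega)$ are given by

$$\mathbf{a}(\vartheta) = [1, e^{j2\pi\vartheta}, \ldots, e^{j2\pi(T-1)\vartheta}]^T$$

$$\mathbf{b}(\omega) = [1, e^{j2\pi\omega}, \ldots, e^{j2\pi(S-1)\omega}]^T$$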

respectively. Assume for the moment that only a single target
may exist in the observed data and that both the target's steering
vector and the interference-plus-noise covariance matrix $\Gamma$ are
known. We next derive a fully adaptive detection algorithm
based on the generalized likelihood ratio test (GLRT). Since $\mathbf{v}_t$
and $\Gamma$ are assumed known, the GLR has to be maximized only
with respect to $C_t$, which is the unknown amplitude parameter
of the target. Thus, the GLR has the form

$$\Lambda = \frac{\max_{C_t} p_{\mathbf{y}|H_1}(\mathbf{y} \mid C_t, H_1)}{p_{\mathbf{y}|H_0}(\mathbf{y} \mid H_0)} \tag{31}$$

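Following a standard procedure (see, e.g., [7] and [9]), the GLR
test statistic, which we denote by $|z(\omega_t, \vartheta_t)|^2$, can be shown to
have the equivalent form

$$|z(\omega_t, \vartheta_t)|^2 = \frac{|\mathbf{v}_t^H \Gamma^{-1} \mathbf{y}|^2}{\mathbf{v}_t^H \Gamma^{-1} \mathbf{v}_t} \tag{32}$$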



statistic evaluated at this frequency against the threshold. Thus,
the GLRT when $\Gamma$ is perfectly known is given by

$$\max_{\omega, \vartheta} |z(\omega, \vartheta)|^2 \tag{35}$$

In other words, in the case of a known covariance matrix, the test
is equivalent to finding the 2-D frequency where the magnitude
of the 2-D DFT of $\Psi$ is maximal, followed by comparison of the
value of the test statistic at this frequency against the threshold.

Note that under both the null hypothesis (no target) $H_0$ and
the alternative hypothesis $H_1$, $(\mathbf{b}^H(\omega) \otimes \mathbf{a}^H(\vartheta))\boldsymbol{\psi}$
is a Gaussian random variable, being a linear transformation of
a Gaussian random vector. Assuming $\Gamma$ is perfectly known, it
is not difficult to show [13] that after prewhitening by $\Gamma^{-1/2}$,
the GLRT statistic in (35) is $\chi^2$ distributed
with two degrees of freedom under $H_0$ and noncentral
$\chi^2$ with two degrees of freedom under $H_1$.

Finally, since $\Gamma$ is also unknown, we adopt an approach similar
to that employed in the derivation of the adaptive matched
filter (AMF) in [7] and substitute the unknown covariance matrix
with its estimate, which is obtained as explained in the previous
sections.
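
The test statistic and its 2-D DFT interpretation can be illustrated with a short sketch. It assumes a steering vector of the form $\mathbf{b}(\omega) \otimes \mathbf{a}(\vartheta)$ with lengths S and T, a row-major reshaping of $\Gamma^{-1}\mathbf{y}$ into an S x T array, and a randomly generated Hermitian positive definite matrix standing in for the estimated covariance. None of the numerical values come from the paper.

```python
import numpy as np

def steering_vector(omega, theta, S, T):
    """v = b(omega) kron a(theta), with b of length S and a of length T."""
    b = np.exp(2j * np.pi * omega * np.arange(S))
    a = np.exp(2j * np.pi * theta * np.arange(T))
    return np.kron(b, a)

def amf_statistic(y, Gamma, v):
    """|v^H Gamma^{-1} y|^2 / (v^H Gamma^{-1} v), cf. eq. (32)."""
    Gi_y = np.linalg.solve(Gamma, y)
    Gi_v = np.linalg.solve(Gamma, v)
    return np.abs(np.vdot(v, Gi_y)) ** 2 / np.real(np.vdot(v, Gi_v))

rng = np.random.default_rng(1)
S, T = 4, 6
M = rng.standard_normal((S * T, S * T)) + 1j * rng.standard_normal((S * T, S * T))
Gamma = M @ M.conj().T + S * T * np.eye(S * T)   # stand-in covariance estimate
y = rng.standard_normal(S * T) + 1j * rng.standard_normal(S * T)

psi = np.linalg.solve(Gamma, y)      # psi = Gamma^{-1} y
Psi = psi.reshape(S, T)              # row k holds psi((k-1)T+1), ..., psi(kT)
F = np.fft.fft2(Psi)                 # 2-D DFT on the grid omega = p/S, theta = q/T

p, q = 1, 2
v = steering_vector(p / S, q / T, S, T)
numerator_direct = np.vdot(v, psi)   # v^H psi
z2 = amf_statistic(y, Gamma, v)
```

On the DFT grid, `numerator_direct` coincides with `F[p, q]`, so the numerator of the statistic can be evaluated for all grid frequencies with a single 2-D FFT; the denominator still depends on the steering vector.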

To illustrate the operation of the proposed solution, we
resort to numerical evaluation of some specific examples
(see [13] for a detailed performance analysis and additional
examples). Consider a 2-D observed random field consisting
of a sum of a purely indeterministic component (background
noise), a single evanescent (interference) component, and three
harmonic components (targets). The purely indeterministic
component is a complex valued circular Gaussian white noise
field. The evanescent component spectral support parameters
are (a, b) = (1, 2), $\nu^{(1,2)} = 0$. The modulating 1-D purely
indeterministic process of this evanescent component is a
first-order Gaussian AR process, with driving noise variance
$(\sigma^{(1,2)})^2 = 2$ and $a^{(1,2)}(1) = -0.5$. There are three targets
that are located at (0.05, 0), (0.15, 0.15), and (-0.25, 0.15),
respectively. The observed field dimensions are 48 x 48.

Let us define the power of each of the field components
as $E_w = \mathbf{w}^H\mathbf{w}$ for the purely indeterministic component;
$E_e = (\mathbf{e}^{(a,b)})^H \mathbf{e}^{(a,b)}$ for the evanescent component; and
$E_{h_k} = \mathbf{h}_k^H \mathbf{h}_k$, k = 1, 2, 3, for each of the harmonic
components, where $\mathbf{h}_k$ is defined in the same way $\mathbf{w}$ and $\mathbf{e}^{(a,b)}$
are defined. In this example, we have $E_e/E_w = 6$ dB,
whereas for the three targets, we have $E_{h_1}/E_w = -12.8$ dB,
$E_{h_2}/E_w = -14.5$ dB, and $E_{h_3}/E_w = -15$ dB. Due to the strong
interference component, the presence of the three targets is hard
to detect in the observed data whose power spectral density is
depicted in Fig. 3. However, these targets are easily detected
by the test statistic $|z(\omega, \vartheta)|$, depicted in Fig. 4. In Fig. 4,
$|z(\omega, \vartheta)|$ is depicted as a function of the 2-D frequencies, i.e.,
angle and Doppler.

VII. Parametric Partially Adaptive Processing 

The low rank of the interference covariance matrix is ex- 
ploited in the partially adaptive STAP to significantly reduce 
the adaptive problem dimensionality. In this section, we derive a 
partially adaptive processing algorithm, based on the estimated 
parametric model of the interference. Moreover, it is proved in 



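Let $\boldsymbol{\psi} = \Gamma^{-1}\mathbf{y}$. We thus have

$$|z(\omega_t, \vartheta_t)|^2 = \frac{\left|\left(\mathbf{b}^H(\omega_t) \otimes \mathbf{a}^H(\vartheta_t)\right)\boldsymbol{\psi}\right|^2}{\mathbf{v}_t^H \Gamma^{-1} \mathbf{v}_t} \tag{33}$$

Reorganizing the elements of $\boldsymbol{\psi}$ into an S x T matrix $\Psi$ where the
elements of the kth row of $\Psi$ are $\psi((k-1)T + 1), \ldots, \psi(kT)$,
we conclude that for a linear, uniformly spaced sensor array and
uniform CPI

$$\left(\mathbf{b}^H(\omega) \otimes \mathbf{a}^H(\vartheta)\right)\boldsymbol{\psi} = \sum_{p=1}^{S}\sum_{q=1}^{T} e^{-j2\pi(p-1)\omega}\, e^{-j2\pi(q-1)\vartheta}\, \Psi(p, q). \tag{34}$$

Thus, $(\mathbf{b}^H(\omega) \otimes \mathbf{a}^H(\vartheta))\boldsymbol{\psi}$ and $\Psi$ are a 2-D DFT pair. However,
since in fact the steering vector is unknown, the detector must
first estimate the frequency where the magnitude of the 2-D DFT
of $\Psi$ is maximal, followed by comparison of the value of the test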
Fig. 3. Power spectral density of the observed field.
Fig. 4. Test statistic $|z(\omega, \vartheta)|$.



this section that in order to implement the proposed partially 
adaptive processing method, only the spectral support parame- 
ters of the interference need to be estimated, and there is no need 
whatsoever to estimate the modulating process of the interfer- 
ence model, nor the data covariance matrix. 
More specifically, recall that 

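Having estimated a, b, and $\nu_i^{(a,b)}$ using the algorithm in
Section V, the vector $\mathbf{d}_i^{(a,b)}$ is known. Hence, demodulating $\mathbf{e}_i^{(a,b)}$,
we conclude using (13) that

$$\boldsymbol{\xi}_i^{(a,b)} = \mathbf{e}_i^{(a,b)} \odot \left((\mathbf{d}_i^{(a,b)})^H\right)^T \tag{37}$$

However, from (17), we conclude that the covariance matrix of
$\boldsymbol{\xi}_i^{(a,b)}$ is given by

$$\tilde{\mathbf{R}}_i^{(a,b)} = \mathbf{A}_i^{(a,b)} \mathbf{R}_i^{(a,b)} (\mathbf{A}_i^{(a,b)})^T \tag{38}$$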

In the following, we prove that since a and b are already
known, an orthogonal projection matrix onto the low-rank
subspace spanned by the evanescent field covariance matrix
can be found without estimating the parametric model of the
evanescent field 1-D modulating process and, hence, without
estimating $\mathbf{R}_i^{(a,b)}$. Moreover, this result enables us to avoid
both evaluating the field covariance matrix and applying a
computationally intensive eigenanalysis to the
estimated covariance matrix. More specifically, let us construct
the following orthogonal projection matrix:

$$\mathbf{T}_i^{(a,b)} = \mathbf{A}_i^{(a,b)}\left((\mathbf{A}_i^{(a,b)})^T \mathbf{A}_i^{(a,b)}\right)^{-1}(\mathbf{A}_i^{(a,b)})^T. \tag{39}$$

It is easily verified (by substitution) that $\mathbf{T}_i^{(a,b)}$ is an orthogonal
projection onto the range space of the covariance matrix $\tilde{\mathbf{R}}_i^{(a,b)}$ in (38),
since for any ST-dimensional vector $\mathbf{v}$

$$\tilde{\mathbf{R}}_i^{(a,b)}\mathbf{v} = \tilde{\mathbf{R}}_i^{(a,b)}\mathbf{T}_i^{(a,b)}\mathbf{v} = \mathbf{T}_i^{(a,b)}\tilde{\mathbf{R}}_i^{(a,b)}\mathbf{v}. \tag{40}$$

In addition, $(\mathbf{T}_i^{(a,b)})^2 = \mathbf{T}_i^{(a,b)}$ and $(\mathbf{T}_i^{(a,b)})^T = \mathbf{T}_i^{(a,b)}$.

Note that since $\mathbf{A}_i^{(a,b)}$ is a sparse matrix of zeros and
ones only, the computation of $\mathbf{T}_i^{(a,b)}$ is very simple. The
projection matrix onto the subspace orthogonal to the
interference space is therefore given by $(\mathbf{T}_i^{(a,b)})^{\perp} = \mathbf{I} - \mathbf{T}_i^{(a,b)}$.
Hence, by projecting the demodulated observed data vector
$\tilde{\mathbf{y}} = \mathbf{y} \odot ((\mathbf{d}_i^{(a,b)})^H)^T$ onto the subspace orthogonal to the
interference subspace, a reduced-dimension data vector given
by $\breve{\mathbf{y}} = (\mathbf{T}_i^{(a,b)})^{\perp}\tilde{\mathbf{y}}$ is obtained, such that the interference
contribution to the observed signal is mitigated. Remodulating
$\breve{\mathbf{y}}$ by evaluating $\breve{\mathbf{y}} \odot \mathbf{d}_i^{(a,b)}$, followed by sequentially applying
this procedure to mitigate each of the interference sources, the
detection problem is reduced to that of detecting a target in
the presence of background noise only. Following a similar
derivation to the one in (31)-(35), we conclude that in the
special case where the background noise is known to be a
white noise field, the statistical test is obtained by finding
the 2-D frequency where the magnitude of the 2-D DFT of
the processed data vector (organized back into a 2-D array)
is maximal, followed by comparison of the value of the test
statistic at this frequency against the threshold. In the more
general case, where the purely indeterministic component of
the field is not a white noise field, the observed data vector
is first prewhitened using the estimated covariance matrix of the
purely indeterministic component. It is shown in
[13] that the probability density function of the GLR test that
upper bounds the performance of the actual detector is $\chi^2$ with
two degrees of freedom under $H_0$ and noncentral $\chi^2$ with two
degrees of freedom under $H_1$.
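
The projection step can be sketched as follows. The sketch assumes that the zeros-and-ones matrix A maps the samples s(k) of the modulating process to the grid points on the line k = na + mb, with the field vectorized row by row, and that the projector has the usual form $A(A^TA)^{-1}A^T$. The helper names, the vectorization order, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def line_indicator(S, T, a, b):
    """Zero-one matrix A (S*T x K): row (n, m) selects its line k = n*a + m*b."""
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    n, m = n.ravel(), m.ravel()
    _, col = np.unique(n * a + m * b, return_inverse=True)
    A = np.zeros((S * T, col.max() + 1))
    A[np.arange(S * T), col] = 1.0
    return A, n, m

def interference_projector(A):
    """A (A^T A)^{-1} A^T; A^T A is diagonal (points per line), so no
    matrix inversion or eigenanalysis is needed."""
    counts = A.sum(axis=0)
    return (A / counts) @ A.T

rng = np.random.default_rng(3)
S, T, a, b, c, d, nu = 8, 8, 1, 2, 0, 1, 0.2
A, n, m = line_indicator(S, T, a, b)
P = interference_projector(A)
P_perp = np.eye(S * T) - P

d_vec = np.exp(2j * np.pi * nu * (n * c + m * d))        # modulation vector
s = rng.standard_normal(A.shape[1]) + 1j * rng.standard_normal(A.shape[1])
e = (A @ s) * d_vec                                      # pure interference
# Demodulate, project onto the orthogonal complement, remodulate.
residual = (P_perp @ (e * d_vec.conj())) * d_vec
```

Only a, b, c, d, and the frequency enter the construction. The samples `s` of the modulating process are never estimated, yet the interference is removed by the projection.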

As an example, consider the same field as in the previous sec- 
tion. Due to the strong interference component, the presence of 
the three targets is difficult to detect in the observed data, whose 
power spectral density is depicted in Fig. 3. However, these 
targets are easily detected in the processed data, as illustrated 
in Fig. 5. This result is obtained without estimating the
parametric model of the evanescent field 1-D modulating process
and, hence, without estimating the interference-plus-noise
covariance matrix. Since both the estimation of the
interference-plus-noise covariance matrix and its analysis are avoided,
the proposed parametric partially adaptive processing method
is robust and computationally attractive (see [13] for a detailed
performance analysis and additional examples).

Fig. 5. Test statistic of the parametric partially adaptive processor. The power
spectral density of the field after being projected onto the subspace orthogonal
to the interference subspace.

VIII. Conclusions 

In this paper, a novel parametric approach for modeling, 
estimation, and target detection for STAP data has been 
derived. The proposed parametric interference mitigation 
procedures employ the information in only a single range gate, 
thus achieving high performance gain when the data in the 
different range gates cannot be assumed stationary. The model 
is based on the results of the 2-D Wold-like decomposition. 
We showed that the same parametric model that results from 
the 2-D Wold-like orthogonal decomposition naturally arises
as the physical model in the problem of space-time processing 
of airborne radar data. We exploited this correspondence to 
derive computationally efficient fully adaptive and partially 
adaptive detection algorithms. Having estimated the models 
of the noise and interference components of the field, the 
estimated parameters are substituted into the parametric ex- 
pression of the covariance matrix to obtain an estimate of the 
interference-plus-noise covariance matrix. Hence, the fully 
adaptive weight vector is obtained, and a corresponding test is 
derived. Moreover, we proved that it is sufficient to estimate 
only the spectral support parameters of each interference com- 
ponent in order to obtain a projection matrix onto the subspace 
orthogonal to the interference subspace. Thus, the resulting 
detector is statistically superior to the fully adaptive detector 
as considerably fewer parameters need to be estimated. Since
a much smaller number of parameters need to be estimated, the
proposed partially adaptive detector is also computationally
much simpler. Statistical analysis of the performance of the
proposed detectors is considered in [13]. 

Appendix
Linear Diophantine Equation

Let $k$ and $\ell$ be two nonzero integers and $p$ some other integer.
The equation

$$kx - \ell y = p$$

is called the linear Diophantine equation. A solution of this
equation is a pair (x, y) of integers (a lattice point in the plane)
that satisfies the equation. We use the following well-known
theorem (e.g., see [23]).

Theorem 1: The linear Diophantine equation

$$kx - \ell y = p$$

has a solution if and only if $q \mid p$ (i.e., q divides p), where
$q = \gcd(k, \ell)$. Furthermore, if $(x_0, y_0)$ is a solution of this
equation, then the set of solutions of the equation consists of all
integer pairs (x, y) of the form

$$x = x_0 + t\frac{\ell}{q} \quad \text{and} \quad y = y_0 + t\frac{k}{q}, \quad t \in \mathbb{Z}. \tag{41}$$

Note that if $k$ and $\ell$ are coprime, then there will always be
solutions, given by (41).
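
A particular solution can be computed with the extended Euclidean algorithm, which is a standard technique and is not described in the paper itself. The sketch below uses hypothetical function names; it returns one solution together with the step of the solution family in (41).

```python
def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g, where g = gcd(a, b) > 0
    for nonzero integers a and b."""
    old_r, r, old_u, u, old_v, v = a, b, 1, 0, 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_u, u = u, old_u - quotient * u
        old_v, v = v, old_v - quotient * v
    if old_r < 0:                       # normalize the sign of the gcd
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v

def solve_linear_diophantine(k, l, p):
    """One solution (x0, y0) of k*x - l*y = p and the step (l/q, k/q) of
    eq. (41), or None when q = gcd(k, l) does not divide p."""
    q, u, v = extended_gcd(k, l)
    if p % q != 0:
        return None
    x0, y0 = u * (p // q), -v * (p // q)
    return (x0, y0), (l // q, k // q)

(x0, y0), (dx, dy) = solve_linear_diophantine(3, 5, 7)
family = [(x0 + t * dx, y0 + t * dy) for t in range(-3, 4)]
```

The same routine yields a pair (c, d) with ad - bc = 1 by calling `solve_linear_diophantine(a, b, 1)` and reading the result as (d, c).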
Abstract

In this paper we consider subspace approximation tailored 
to adaptive airborne radar. Motivation for this research 
includes the need for reduced computational burden and 
approaches for practical implementation. Measured radar 
data only approximately satisfies the statistical assumptions 
intrinsic to the adaptive processor. Hence, approximate 
numerical methods for adaptive weight computation may 
successfully be used in place of exact methods. We propose 
a numerical procedure based on partial bi-diagonalization 
of the interference covariance matrix, coupled with a pre- 
conditioned conjugate gradient iterative method, to extract 
approximate basis vectors for the interference subspace. 
We use these basis vectors to construct an adaptive weight 
vector. Through example, we show the potential of this 
method for adaptive radar. 



the sequence of required operations amounts to a singular 
value decomposition (SVD) of the data matrix. 
Nevertheless, such approaches leverage the numerical 
stability of the SVD and require minimal sample support for 
covariance matrix estimation. These considerations are 
important when applying STAP in routinely encountered 
inhomogeneous and sample-poor signal environments. 

In this paper we propose approximate numerical 
techniques as an alternative approach for the airborne radar 
STAP problem. The proposed approach capitalizes on: (1) 
the generally low numerical rank of the covariance matrix; 
(2) the realization that actual airborne radar signal 
environments only approximately satisfy assumed statistical 
models; and, (3) low-rank realizations require smaller 
sample populations for parameter estimation. 

2. Adaptive Airborne Radar 



1. Introduction 

Airborne radar must detect targets of diminishing radar 
cross-section, often at decreased radial velocities. The 
interference environment is severe and poses a significant 
challenge to effective target detection. Exploiting signal 
diversity over multiple domains offers enhanced detection 
performance. Space-time adaptive processing (STAP)
represents a class of multi-domain adaptive techniques
useful in such circumstances. The theory of adaptive radar 
was proposed in a series of papers by Brennan, Mallett and 
Reed in the early 1970's [1-2]. STAP remains an active 
research topic and most view advanced STAP techniques as 
a vital component of future airborne and spaceborne radar 
systems. 

Recent STAP research focuses largely on mitigating 
computational burden and/or proposing novel processing 
architectures [3-8]. Eigenbased STAP methods, also called
reduced-rank or partially adaptive STAP methods, rely on a sequence
of linear transformations and selection operators to generate 
a set of adaptive weights [5-8]. Such methods approach 
optimum performance, but at the expense of considerably 
increased computational complexity. This occurs because 



Consider an M channel, aircraft-mounted array receiving
N pulses. The space-time snapshot, $\mathbf{x}_k \in \mathbb{C}^{MN \times 1}$, is given by



with $x_k(m, n)$ representing a complex baseband observation
at the kth range, mth channel, and nth pulse. Each snapshot
consists of additive signal contributions from clutter, $\mathbf{x}_{k,C}$,
jamming, $\mathbf{x}_{k,J}$, uncorrelated noise, $\mathbf{x}_{k,N}$, and targets, $\mathbf{x}_{k,T}$,
such that

$$\mathbf{x}_k = \begin{cases} \mathbf{x}_{k|H_0} = \mathbf{x}_{k,C} + \mathbf{x}_{k,J} + \mathbf{x}_{k,N} \\ \mathbf{x}_{k|H_1} = \mathbf{x}_{k,C} + \mathbf{x}_{k,J} + \mathbf{x}_{k,N} + \mathbf{x}_{k,T} \end{cases} \tag{2}$$

for the null and alternative hypotheses. The output of the
adaptive processor is given by

$$y_k = \mathbf{w}_k^H \mathbf{x}_k \tag{3}$$

where $\mathbf{w}_k \in \mathbb{C}^{MN \times 1}$ is the adaptive weight vector. This
weight vector takes the general form

$$\mathbf{w}_k = \alpha_k \hat{\mathbf{R}}_k^{-1} \mathbf{s}_T \tag{4}$$

where $\mathbf{s}_T \in \mathbb{C}^{MN \times 1}$ is the target space-time steering vector,
$\hat{\mathbf{R}}_k$ is the maximum likelihood estimate (MLE) of the
interference covariance matrix [2], and $\alpha_k$ is a constant.
The space-time steering vector represents the response of the
array to a point source with a specific direction of arrival
and Doppler frequency. Note that $\hat{\mathbf{R}}_k$ approximates
$E[\mathbf{x}_{k|H_0}\mathbf{x}_{k|H_0}^H]$. We subsequently subject $y_k$ to binary
hypothesis testing as a means of declaring target presence.
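
A minimal sketch of (3) and (4) is given below. It assumes that the covariance estimate is the usual sample average of outer products over L training snapshots, and that the constant in (4) is chosen as $1/(\mathbf{s}_T^H \hat{\mathbf{R}}_k^{-1} \mathbf{s}_T)$ so that the target response has unit gain. That normalization is one common choice, not necessarily the one used in the paper, and the data are random placeholders.

```python
import numpy as np

def sample_covariance(X):
    """X holds one snapshot per row (L x MN); returns (1/L) * sum_l x_l x_l^H."""
    return X.T @ X.conj() / X.shape[0]

def adaptive_weights(R_hat, s_T, unit_gain=True):
    """w = alpha * R_hat^{-1} s_T, eq. (4). With unit_gain, alpha is
    1 / (s_T^H R_hat^{-1} s_T); otherwise alpha = 1."""
    Ri_s = np.linalg.solve(R_hat, s_T)
    alpha = 1.0 / np.real(np.vdot(s_T, Ri_s)) if unit_gain else 1.0
    return alpha * Ri_s

rng = np.random.default_rng(2)
M, N, L = 4, 3, 48                      # channels, pulses, training snapshots
X = (rng.standard_normal((L, M * N)) + 1j * rng.standard_normal((L, M * N))) / np.sqrt(2)
s_T = np.exp(2j * np.pi * 0.1 * np.arange(M * N)) / np.sqrt(M * N)  # placeholder steering vector

R_hat = sample_covariance(X)
w = adaptive_weights(R_hat, s_T)
y_out = np.vdot(w, X[0])                # eq. (3): y_k = w_k^H x_k
```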

3. Eigenbased Adaptive Radar 

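The sample covariance matrix $\hat{\mathbf{R}}_k$ is Hermitian and
positive definite. For this reason, a unitary matrix $\mathbf{Q}_k$ exists
such that

$$\hat{\mathbf{R}}_k = \mathbf{Q}_k \Lambda_k \mathbf{Q}_k^H = \sum_{m=1}^{MN} \lambda_k(m)\, \mathbf{q}_k(m)\, \mathbf{q}_k^H(m), \tag{5}$$

where $\lambda_k(1) \ge \lambda_k(2) \ge \ldots \ge \lambda_k(MN)$, $\mathbf{q}_k(m)$ is the mth
column of $\mathbf{Q}_k$, and $\mathbf{I}_{MN}$ is the MN x MN identity matrix.
$\lambda_k(m)$ and $\mathbf{q}_k(m)$ are the mth eigenvalue and eigenvector of
$\hat{\mathbf{R}}_k$, respectively. Decomposing (5) into principal
components (PC) and noise yields a principal part with

$$\lambda_k(1) \ge \lambda_k(2) \ge \ldots \ge \lambda_k(P), \tag{6}$$

and a noise part built from the eigenvectors $[\mathbf{q}_k(P+1), \mathbf{q}_k(P+2), \ldots, \mathbf{q}_k(MN)]$, with

$$\lambda_k(P) \gg \lambda_k(P+1) \approx \lambda_k(P+2) \approx \ldots \approx \lambda_k(MN). \tag{7}$$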



Observe that k x k - The principal and noise 
components define the interference and noise subspaces. 
An interesting result in [5] shows that explicit knowledge 



interpretations. For example, the two methods proposed in
[7] rely on constructing a weight vector lying in the noise
subspace. Since the noise subspace is orthogonal to the
correlated interference, selecting a weight vector $\mathbf{w}_k$ that
lies in the span of the noise eigenvectors cancels correlated interference.
The cross-spectral metric (CSM) method discussed in [8]
applies a cost function to the problem of choosing the best
low-rank eigenbasis. In this case, the weight vector
generally spans the noise subspace as seen through the
following alternative view of the CSM method. From (2)-(5),
the signal-to-interference plus noise ratio (SINR) may
be written for a normalized target signal as

$$\mathrm{SINR} = \mathbf{s}_T^H \hat{\mathbf{R}}_k^{-1} \mathbf{s}_T = \sum_{m=1}^{NM} \frac{|\mathbf{s}_T^H \mathbf{q}_k(m)|^2}{\lambda_k(m)}. \tag{9}$$

Accordingly, with maximum SINR as the objective, a low-rank
basis selection should choose those eigen-components
which maximize the partial sum of terms in (9). The desired
target response influences this selection. For example,
consider the top plot in Figure 1 showing a typical
eigenspectrum for a simulated airborne radar clutter
covariance matrix. In contrast, the bottom plot in the figure
shows the individual "CSM" terms, $|\mathbf{s}_T^H \mathbf{q}_k(m)|^2/\lambda_k(m)$,
as defined in (9). It is evident from this example that terms
with the largest CSM predominantly lie in the noise
subspace.
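
The eigen-expansion in (9) and the ranking of eigen-components by their CSM terms can be illustrated with a short sketch. The covariance matrix and steering vector are random placeholders, and the function names are hypothetical.

```python
import numpy as np

def csm_terms(R_hat, s_T):
    """Eigen-decomposition of R_hat and the terms |s_T^H q(m)|^2 / lambda(m)."""
    lam, Q = np.linalg.eigh(R_hat)          # ascending eigenvalues, orthonormal Q
    proj = Q.conj().T @ s_T                 # q(m)^H s_T for every m
    return lam, Q, np.abs(proj) ** 2 / lam

def reduced_rank_weights(R_hat, s_T, r):
    """Weights built from the r eigen-components with the largest CSM terms,
    together with the partial SINR sum they contribute."""
    lam, Q, csm = csm_terms(R_hat, s_T)
    keep = np.argsort(csm)[::-1][:r]
    coeff = (Q[:, keep].conj().T @ s_T) / lam[keep]
    return Q[:, keep] @ coeff, csm[keep].sum()

rng = np.random.default_rng(4)
n = 12
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R_hat = G @ G.conj().T + np.eye(n)          # placeholder Hermitian PD matrix
s_T = np.exp(2j * np.pi * 0.2 * np.arange(n)) / np.sqrt(n)

lam, Q, csm = csm_terms(R_hat, s_T)
sinr_full = np.real(np.vdot(s_T, np.linalg.solve(R_hat, s_T)))
w_r, sinr_r = reduced_rank_weights(R_hat, s_T, r=4)
w_all, _ = reduced_rank_weights(R_hat, s_T, r=n)
```

Keeping all n components reproduces $\hat{\mathbf{R}}_k^{-1}\mathbf{s}_T$, while keeping only the r largest CSM terms gives the largest partial sum in (9) attainable with r components.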

4. Subspace Processing 

The preceding section is central to the development of 
numerical approximations suited to the STAP problem. As 
just discussed, three choices emerge for constructing the 
reduced-rank weight vector: 1) W k lies in the interference 
subspace; 2) W k lies in the orthogonal noise subspace; or 
3) W k uses those basis vectors which optimize an objective 
function. If the basis vectors arise from the SVD, it is 
sensible to construct W k from all principal components. 
Complications may arise if the interference rank is fuzzy. 

An advantage of subspace processing is reduced sample 



of the true interference subspace completely solves the
optimum filtering problem. Assuming matched channels 
and letting a k ^ <t k = 1 , we may write (4) as 



support requirements for covariance matrix estimation.
Generally, the airborne radar signal environment is
inhomogeneous (e.g., spatially varying clutter and discretes)
and non-stationary (data is coherent over a limited time 
interval). These factors limit useful sample data. Subspace 
approaches appear robust in such cases and afford the 
possibility of localized adaptive processing schemes. To 
corroborate this notion, we offer Figure 2. This figure 
depicts the actual eigenvalues of a jammer covariance matrix 
(three jammers present) for a 16-element linear array. The 
top plot shows the principal components, whereas the 
bottom plot depicts the noise eigenvalues. Also included in 
both plots are the corresponding eigenvalues for the sample 
covariance matrix estimated from sample support using 1x,
2x, and 10x the total degrees of freedom (DOF). The figure
where $\mathbf{w}_k$ denotes the optimum weight vector, $\lambda_0$ represents
the noise floor, and $\mathbf{q}_k^H(m)\,\mathbf{s}_T$ is the projection of
the mth interference eigenvector onto the quiescent
response. Observe that terms associated with the noise
eigenvalues do not affect the optimum weight vector. This
same observation is made in [6].

Other researchers provide alternative eigenbased 



shows the robustness of the estimation process in
representing the interference subspace, whereas the full-rank
uncorrelated noise is poorly estimated. The interference
eigenvectors are also usually well represented with limited
sample support.

The appealing aspects of subspace methods, specifically
convergence to optimum performance and mitigated sample
support, are offset by computational burden. Taking into
account that actual measured data only approximately
satisfies the statistical assumption of homogeneous
observations, we find sufficient motivation for exploring
approximate subspace methods.

Several other notions influence the development of such
approximation procedures. First of all, we point out that
$\mathbf{q}_k(m) \in \mathrm{span}\{\mathbf{s}_I(p)\}$, where $\{\mathbf{s}_I(p)\}$ is the set of
interference space-time steering vectors. For high
interference-to-noise terms, the eigenbeams generally point
in a single direction. However, the discrete Fourier
transform (DFT) of the eigenvectors can produce results
appearing like a difference pattern when interferers appear
closely spaced, as is true for ground clutter returns. Thus,
each eigenvector is a linear combination of the $\mathbf{s}_I(p)$,
and multiple terms can dominate. Secondly, asymptotic
equivalence exists between Toeplitz and circulant matrices
[9]. Ideal covariance matrices are Toeplitz, whereas sample
covariance matrices are non-Toeplitz. Nevertheless, this
implies the columns of a discrete cosine transform (DCT) or
DFT may serve as appropriate surrogates for the actual
eigenvectors of the covariance matrix. (DCT and DFT
vectors are eigenvectors of circulant matrices.) Let $\mathbf{G}$ be a
unitary matrix whose columns are selected DCT or DFT
vectors, $\mathbf{D}$ be a diagonal matrix approximating the principal
eigenvalues, and $\mathbf{E}$ represent the residual off-diagonal terms.
In the case of closely spaced interferers, we generally find
that

$$\|\hat{\mathbf{R}}_k - \mathbf{G}\mathbf{D}\mathbf{G}^H\|, \tag{10}$$

where $\|\cdot\|$ is an appropriate norm, is unacceptably large.

A deterministic basis may not adequately diagonalize $\hat{\mathbf{R}}_k$.
An adaptive basis, generated via efficient numerical
routines, offers the potential for better performance at
modest cost.

5. Numerical Subspace Approximation 

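We may interpret STAP as the constrained minimization
of output power subject to a linear constraint,

$$\min_{\mathbf{w}_k} \mathbf{w}_k^H \hat{\mathbf{R}}_k \mathbf{w}_k \quad \text{such that} \quad \mathbf{w}_k^H \mathbf{s}_T = 1. \tag{11}$$

Define the data matrix of snapshots from (1) as



Upon substituting the MLE for $\hat{\mathbf{R}}_k$, (11) equates to the
linear least squares problem (LLSP),

$$\min \|\mathbf{X}\mathbf{w}_k\|_2 \quad \text{such that} \quad \mathbf{s}_T^H \mathbf{w}_k = 1. \tag{13}$$

Assume that $\|\mathbf{s}_T\|_2 = 1$ and let $\mathbf{H}\mathbf{s}_T = \mathbf{e}_n$, where $\mathbf{e}_n$ is
the nth column of the appropriately dimensioned identity
matrix. Furthermore, let $\mathbf{H}\mathbf{w}_k = \mathbf{u}$. We then
express (13) as

$$\min \|\mathbf{X}\mathbf{H}^H\mathbf{H}\mathbf{w}_k\|_2 \quad \text{such that} \quad \mathbf{s}_T^H\mathbf{H}^H\mathbf{H}\mathbf{w}_k = 1, \tag{14}$$

from which we get the unconstrained form

$$\rho = \min_{\mathbf{v}} \|\mathbf{Z}_{n-1}\mathbf{v} + \mathbf{z}_n\|_2, \qquad \mathbf{Z} = \mathbf{X}\mathbf{H}^H = [\mathbf{Z}_{n-1},\ \mathbf{z}_n], \quad \mathbf{u} = \begin{bmatrix}\mathbf{v} \\ 1\end{bmatrix}, \tag{15}$$

where n = NM. For a linear solver based on a direct
QR-decomposition, the cost of finding $\mathbf{v}$ is $O(N^3M^3)$ with
L = 2 x DOF. The required computational rate can be very
large.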
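
The passage from the constrained problem (13) to the unconstrained form (15) can be sketched as follows. The sketch assumes that H is a (phase-corrected) Householder reflector mapping the unit-norm steering vector to the last column of the identity, and it uses a generic least-squares solver. The data are random placeholders, and the closed-form solution is computed only for comparison.

```python
import numpy as np

def unitary_to_last_axis(s):
    """Unitary H with H @ s = e_n (last column of the identity), unit-norm s."""
    n = s.size
    phase = s[-1] / abs(s[-1]) if abs(s[-1]) > 0 else 1.0
    target = np.zeros(n, dtype=complex)
    target[-1] = -phase                        # makes s^H target real and <= 0
    v = s - target
    H0 = np.eye(n, dtype=complex) - 2.0 * np.outer(v, v.conj()) / np.vdot(v, v)
    return -np.conj(phase) * H0                # rescale so that H @ s = e_n exactly

def constrained_llsp(X, s_T):
    """min ||X w||_2 subject to s_T^H w = 1, via the unconstrained form (15)."""
    H = unitary_to_last_axis(s_T)
    Z = X @ H.conj().T                         # Z = X H^H = [Z_{n-1}, z_n]
    v, *_ = np.linalg.lstsq(Z[:, :-1], -Z[:, -1], rcond=None)
    u = np.append(v, 1.0)                      # the constraint fixes the last entry
    return H.conj().T @ u                      # w = H^H u

rng = np.random.default_rng(5)
L, n = 40, 6
X = rng.standard_normal((L, n)) + 1j * rng.standard_normal((L, n))
s_T = rng.standard_normal(n) + 1j * rng.standard_normal(n)
s_T /= np.linalg.norm(s_T)

w = constrained_llsp(X, s_T)
R = X.conj().T @ X                             # unnormalized covariance for this X
Ri_s = np.linalg.solve(R, s_T)
w_closed_form = Ri_s / np.vdot(s_T, Ri_s)      # minimizer of w^H R w with s^H w = 1
```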

5.1 Low-Rank Approximation of the Data Matrix 

The singular values of the data matrix relate closely to the
eigenvalues of $\hat{\mathbf{R}}_k$. When Z is of low numerical rank, as is
often the case in airborne radar [3, 5-8], we can approximate
(15) by another minimization problem in a lower dimension
subspace $\mathrm{span}(\mathbf{Y}) \subset \mathrm{span}(\mathbf{X})$, for which the minimizer $\mathbf{p}^*$ of

$$\min \|\mathbf{Y}\mathbf{p}\| \quad \text{such that} \quad \mathbf{s}_T^H \mathbf{p} = 1 \tag{16}$$

ideally yields small residuals for $\mathbf{X}\mathbf{p}^*$ except at target
locations. If Y adequately represents the dominant subspace
of X, and if the dimension of Y is considerably smaller than
dim(X), significant savings in computational cost result
without sacrificing target detection performance.

The discussion of eigenbased methods suggests selecting Y
via the dominant left singular vectors of the SVD,
$\mathbf{X} = \mathbf{U}\mathbf{S}\mathbf{V}^H$ [5-6]. As already mentioned, the SVD is a
very costly approach to rank-reduction. However, we may
obtain a suitable low-rank approximation through
bi-diagonalization of X. Furthermore, we do not have to
complete the bi-diagonalization to extract meaningful
information regarding the dominant subspace [10]. One
accomplishes bi-diagonalization via sequences of left and
right Householder transformations, U and V, applied
directly to X such that $\mathbf{B} = \mathbf{U}^H\mathbf{X}\mathbf{V}$ is upper (or lower)
bi-diagonal. After k steps of bi-diagonalization, the singular
values of $\mathbf{B}_k = \mathbf{B}(1:k, 1:k)$ are, with high probability,
very good approximations to the largest singular
values of X. Thus, when X is of low numerical rank we
expect
expect 
5.4 Computational Cost



to be close to the best rank-A: approximation of X. 
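
The idea of stopping the bi-diagonalization after k steps can be illustrated with the Golub-Kahan (Lanczos) recurrence. It is swapped in here for the Householder sequence described in the text because it is shorter to write down, and it also produces an upper bi-diagonal $\mathbf{B}_k$ with $\mathbf{X}\mathbf{V}_k = \mathbf{U}_k\mathbf{B}_k$. The sketch uses full reorthogonalization, does not handle breakdown (a zero norm), and runs on a synthetic low-rank-plus-noise data matrix. All of these are assumptions for illustration.

```python
import numpy as np

def partial_bidiagonalization(X, k, v1):
    """k steps of Golub-Kahan bidiagonalization with full reorthogonalization.
    Returns U (L x k), B (k x k, upper bidiagonal), V (n x k), X @ V = U @ B."""
    L, n = X.shape
    U = np.zeros((L, k), dtype=complex)
    V = np.zeros((n, k), dtype=complex)
    B = np.zeros((k, k), dtype=complex)
    v = v1.astype(complex) / np.linalg.norm(v1)
    beta, u_prev = 0.0, np.zeros(L, dtype=complex)
    for j in range(k):
        V[:, j] = v
        u = X @ v - beta * u_prev
        u -= U[:, :j] @ (U[:, :j].conj().T @ u)          # reorthogonalize against U
        alpha = np.linalg.norm(u)
        u /= alpha
        U[:, j], B[j, j] = u, alpha
        w = X.conj().T @ u - alpha * v
        w -= V[:, :j + 1] @ (V[:, :j + 1].conj().T @ w)  # reorthogonalize against V
        beta = np.linalg.norm(w)
        if j + 1 < k:
            B[j, j + 1] = beta
        v, u_prev = w / beta, u
    return U, B, V

rng = np.random.default_rng(6)
L, n, rank, k = 60, 24, 3, 6
low_rank = (rng.standard_normal((L, rank)) + 1j * rng.standard_normal((L, rank))) @ \
           (rng.standard_normal((rank, n)) + 1j * rng.standard_normal((rank, n)))
X = 10.0 * low_rank + rng.standard_normal((L, n)) + 1j * rng.standard_normal((L, n))

U, B, V = partial_bidiagonalization(X, k, rng.standard_normal(n))
sv_B = np.linalg.svd(B, compute_uv=False)   # singular values of the k x k block
sv_X = np.linalg.svd(X, compute_uv=False)   # reference values from a full SVD
```

Comparing `sv_B` with the leading entries of `sv_X` shows how well the partial factorization captures the dominant singular values. Each singular value of $\mathbf{B}_k$ is bounded above by the corresponding singular value of X.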

5.2 LLSP in Lower Dimensional Space 

Consider the following subspace approximation, used in
effect as a rank-reducing transformation to alleviate
computational complexity. First, bi-diagonalize X until an
appreciable gap is found between the 2-norms of column k
of $\mathbf{B}^{(k)}$ and column k+1 of $\mathbf{B}^{(k+1)}$. Next, solve (16) in the
subspace spanned by $\mathbf{U}_{k,1}$ alone,

$$\min \|\mathbf{B}_{k,1}\tilde{\mathbf{p}}\|_2 \quad \text{such that} \quad \tilde{\mathbf{s}}_T^H \tilde{\mathbf{p}} = 1 \tag{18}$$

where $\tilde{\mathbf{s}}_T = \mathbf{V}_{k,1}^H \mathbf{s}_T$ and $\tilde{\mathbf{p}} = \mathbf{V}_{k,1}^H \mathbf{p}$. Note, the unitary
operator U preserves measure and we select $\mathbf{B}_{k,1}$ after its
removal. A solution for $\tilde{\mathbf{p}}$ follows by placing the
constrained minimization into an unconstrained form similar
to (15). One can develop a recursive relation for $\tilde{\mathbf{p}}$ to
efficiently accommodate expanded approximate subspace
dimension. Observe that $\mathbf{p} = \mathbf{V}\,[\tilde{\mathbf{p}};\ \mathbf{0}]$.

5.3 Refinement Via Conjugate Gradient 

From (8) it is seen that the optimum weight vector 
satisfies W k e span^S^Q^ k ). The formulation based on 
partial bi-diagonalization extracts information frornthe data 
matrix such that W k e span(S T ,Q D ) where Q D is an 
approximation to the dominant subspace. As expected, 
other sources of correlated interference not represented by 
Q D influence target detection. When choosing a low 
dimension subspace, it is not necessarily true that the 
dominant terms most greatly influence performance. One 
must consider the cost function when ranking the importance 
of each subspace [8]. A pre-conditioned conjugate gradient 
(CG) iterative method applied to the partially bi- 
diagonalized system of equations allows us to refine the 
weight vector produced from the unconstrained 
minimization in the dominant subspace alone. This 
additional step effectively expands the approximate weight 
vector to include basis vectors representing the weaker 
sources of correlated interference. We use [B k ; / ] as 
an (n-l)x(n-l) preconditioner, where approximates the 
smallest singular values of B k . The pre-conditioner reduces 
singular value spread of the partially bi-diagonal system. It 
is known that convergence is very good for the CG method 
when the data matrix is well conditioned [11]. Thus, CG 
iterations are applicable in this instance. The starting weight 
vector approximation is [p ; 0 ] . 
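
The refinement step can be illustrated with a generic pre-conditioned conjugate gradient loop. This is a sketch under assumptions, not the paper's procedure: the paper's preconditioner is not reproduced here, so a simple diagonal (Jacobi) preconditioner stands in for it, and the system solved is the normal-equation form A = B^H B of a hypothetical bi-diagonal matrix. The starting vector mimics the "[p ; 0]" idea by solving only the leading block and padding with zeros. All names (pcg, apply_minv, k_dom) are illustrative.

```python
import numpy as np


def pcg(A, b, x0, apply_minv, tol=1e-10, max_iter=200):
    """Pre-conditioned CG for Hermitian positive definite A; returns (x, iterations)."""
    x = np.array(x0, dtype=complex)
    r = b - A @ x
    b_norm = np.linalg.norm(b)
    if np.linalg.norm(r) <= tol * b_norm:
        return x, 0
    z = apply_minv(r)
    d = z.copy()
    rz = np.vdot(r, z)
    for it in range(1, max_iter + 1):
        Ad = A @ d
        alpha = rz / np.vdot(d, Ad)
        x += alpha * d
        r -= alpha * Ad
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = apply_minv(r)
        rz_new = np.vdot(r, z)
        d = z + (rz_new / rz) * d
        rz = rz_new
    return x, max_iter


# Hypothetical well-conditioned bi-diagonal system.
rng = np.random.default_rng(3)
n, k_dom = 12, 4
B = np.diag(rng.uniform(1.0, 2.0, n)) + np.diag(rng.uniform(-0.3, 0.3, n - 1), 1)
A = B.T @ B
b = rng.standard_normal(n)

# Warm start: solve in the leading k_dom-dimensional block only, pad with zeros.
x0 = np.zeros(n)
x0[:k_dom] = np.linalg.solve(A[:k_dom, :k_dom], b[:k_dom])

diag_A = np.diag(A).copy()
x_cg, iters = pcg(A, b, x0, lambda r: r / diag_A)
print("iterations used:", iters)
```

The warm start means the iterations only need to account for the part of the solution outside the leading block, which mirrors the role the text assigns to the CG refinement.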



The computational cost of the partial bi-diagonalization 
(step 1) is O(L*NM*k), with k the number of bi- 
diagonalization steps. The pre-conditioned CG iterations 
(step 2) add O(L*NM*G_c) computations, where G_c is the 
number of iterations. 
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Figure 1. Eigenspectra and CSM terms. 



[Plot: Eigenvalue Estimation, Linear Array, Three Jammers. Curves: Actual, 1*DOF, 2*DOF, 10*DOF. Horizontal axis: Eigenvalue Number.] 



Figure 2. Robustness of dominant eigenvalues to low 
sample support. 

6. Example 

To validate the proposed method, consider the case of a 
16-channel airborne linear array receiving 12 pulses. A 
weak target, positioned close to mainbeam clutter in Doppler 
space, is present in the 34th range bin (realization). Figure 
3 shows the optimum filter output versus realization, 
whereas Figure 4 shows the adaptive filter output using the 
conventional sample matrix inversion (SMI) [2]. In 
contrast, Figure 5 shows the result using the subspace 
approximation of partial bi-diagonalization followed by the 
pre-conditioned, unconstrained CG method. The order of 
the partial bi-diagonalization was k=16 and we used 
G_c = 18 CG iterations. Observe from the figures that all 
three methods detect the target. Interestingly enough, the 
approximate method gives the best results. We attribute this 
to improved numerical stability of the procedure in 
comparison to the numerical routines used to invert the 
covariance matrix for the optimum and SMI scenarios. 
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Figure 3. Optimum filter response. 
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Figure 4. SMI adaptive filter response. 
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Figure 5. Unconstrained pre-conditioned CG method. 
7. Summary 

In this paper we propose an approximate subspace 
procedure suited to airborne radar STAP application. 
Recent eigenbased methods proposed by other researchers 
serve as motivation for our pursuit of this topic. The 
approximation involves partial bi-diagonalization and 
pre-conditioned conjugate gradient iterations to mitigate 
computational burden. A simple example shows that this 
approach has merit and warrants further consideration. 
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